A Comprehensive Survey of Multi-Armed Bandit Applications and Algorithms

A comprehensive review of the multi-armed bandit framework, contextual bandits, and their real-world applications in recommender systems, clinical trials, and anomaly detection.

Introduction to Multi-Armed Bandit Problems

Many real-world applications involve sequential decision-making, where an agent must pick the best action among several alternatives. Examples include clinical trials, recommender systems, and anomaly detection. In some cases, side information, or context, is associated with each action (e.g., a user profile), and the feedback, or reward, is limited to the option that was chosen. For example, in clinical trials the context is the patient's medical record (e.g., health condition, family history, etc.), the actions correspond to the treatment options being compared, and the reward represents the outcome of the assigned treatment (e.g., success or failure). An important factor in long-term success in such settings is finding a good balance between exploration (e.g., trying a new treatment) and exploitation (choosing the best treatment known so far).

This inherent trade-off between exploration and exploitation arises in many sequential decision problems and is traditionally formulated as the bandit problem, stated as follows: given K possible actions, or "arms", each associated with a fixed but unknown reward distribution, at each iteration the agent selects an arm to play and receives a reward drawn from that arm's probability distribution, independently of previous actions. The agent's task is to learn to choose its actions so that the cumulative reward over time is maximized.

Key Insights

  • The exploration-exploitation dilemma is central to multi-armed bandit problems
  • Bandit algorithms provide a mathematical framework for balancing exploration and exploitation
  • Contextual bandits incorporate side information to improve decision-making
  • Real-world applications span many domains, including healthcare, e-commerce, and cybersecurity

Multi-Armed Bandit Problem Formulation

The classical multi-armed bandit (MAB) problem is defined by K arms, each with an unknown reward distribution. At each time step t, the agent selects an arm a_t ∈ {1, 2, ..., K} and receives a reward r_t drawn from the distribution of the chosen arm. The goal is to maximize the expected cumulative reward over T rounds or, equivalently, to minimize regret: the difference between the expected cumulative reward of always playing the best arm and that of the arms actually chosen.

Note that the agent must try different arms to learn their rewards (i.e., explore the payoffs) and must also use what it has learned to collect the best payoff (i.e., exploit the learned payoffs). There is a natural trade-off between the two. For example, consider a naive strategy that tries each arm once and then always plays whichever arm looked best. When rewards are noisy, a single sample per arm says little about the arm's mean, so this strategy can often commit to a sub-optimal arm.
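The naive strategy described above can be simulated in a few lines. The following is a minimal sketch, assuming Bernoulli (0/1) rewards and random tie-breaking; the function names and the arm means 0.4 and 0.6 are hypothetical choices made for illustration only.

```python
import random

def pull(mean, rng):
    """Bernoulli reward: 1 with probability `mean`, otherwise 0."""
    return 1 if rng.random() < mean else 0

def try_once_then_commit(means, horizon, rng):
    """Play every arm once, then commit to the arm whose single sample was best."""
    first_rewards = [pull(m, rng) for m in means]
    best_seen = max(first_rewards)
    # Ties are broken uniformly at random (an assumption of this sketch).
    candidates = [a for a, r in enumerate(first_rewards) if r == best_seen]
    committed = rng.choice(candidates)
    total = sum(first_rewards)
    for _ in range(horizon - len(means)):
        total += pull(means[committed], rng)
    return committed, total

def wrong_commit_rate(means, horizon, runs, seed=0):
    """Fraction of independent runs that commit to an arm other than the best one."""
    rng = random.Random(seed)
    best = max(range(len(means)), key=lambda a: means[a])
    wrong = sum(
        try_once_then_commit(means, horizon, rng)[0] != best for _ in range(runs)
    )
    return wrong / runs

# Two arms with true means 0.4 and 0.6 (illustrative values).
rate = wrong_commit_rate([0.4, 0.6], horizon=100, runs=2000)
print(f"committed to the worse arm in {rate:.0%} of runs")
```

With these two arms, the probability of committing to the worse arm works out to 0.4 (0.16 from the worse arm winning outright, plus half of the 0.48 tie probability), which illustrates why a single sample per arm is not enough.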

Regret Formulation

Regret(T) = Σ_{t=1}^{T} [μ* - μ_{a_t}], where μ* is the expected reward of the best arm and μ_{a_t} is the expected reward of the arm chosen at round t
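As a small worked example of the regret formula, the sketch below computes the running regret for a given sequence of choices. It assumes the true arm means are known (which is only the case in simulation) and uses 0-based arm indices; the function name and the numbers are hypothetical.

```python
from itertools import accumulate

def regret_curve(means, chosen_arms):
    """Cumulative regret after each round, given the true mean of every arm."""
    mu_star = max(means)  # expected reward of the best arm
    per_round = [mu_star - means[a] for a in chosen_arms]
    return list(accumulate(per_round))

means = [0.2, 0.5, 0.7]   # illustrative true means, arm 2 is the best
chosen = [0, 1, 2, 2]     # arms played in rounds 1..4
curve = regret_curve(means, chosen)
print(curve)
```

The first two rounds contribute gaps of 0.5 and 0.2, and the last two contribute nothing because the best arm was played, so the total regret is approximately 0.7 (up to floating-point rounding).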

Common Metrics

Cumulative reward, simple regret, and Bayesian regret are the key performance metrics

Various solutions have been proposed for this problem, based on both stochastic and Bayesian formulations; however, these approaches do not take into account the context or side information available to the agent.

Contextual Multi-Armed Bandits

A particularly useful version of MAB is the contextual multi-armed bandit (CMAB), or simply contextual bandit, where at each round, before choosing an arm, the agent observes a context vector x_t that may influence the reward distributions of the arms. The context can include user features, environmental variables, or any other relevant side information. The goal is still to maximize cumulative reward, but the policy may now depend on the observed context.

Contextual bandits have received considerable attention because they fit personalized recommender systems well: the context typically represents user attributes, and the arms correspond to the different items or pieces of content that can be recommended. The reward can be a click, a purchase, or any other form of engagement.

Several algorithms have been developed for contextual bandits, including LinUCB, which assumes a linear relationship between the context and the expected reward of each arm, and Thompson sampling with linear models. These algorithms have shown strong empirical performance in a variety of applications.

Real-World Applications of Multi-Armed Bandits

Clinical Trials

In clinical trials, the multi-armed bandit framework offers an ethical approach to treatment allocation. The context includes the patient's medical records, demographic information, and genetic markers. The arms represent the different treatment options, and the reward indicates treatment success or failure. Bandit algorithms can assign more patients to promising treatments while still exploring the alternatives, which can lead to better patient outcomes and more efficient trials.

Recommender Systems

Recommender systems are one of the most successful applications of bandit algorithms. Major platforms use contextual bandits to personalize content, product, and advertising recommendations. The exploration component lets the system discover user preferences for new items, while exploitation uses known preferences to maximize user engagement. This approach addresses the cold-start problem for new items and adapts to changes in user interests over time.

Anomaly Detection

In anomaly detection systems, bandit algorithms can optimize the allocation of limited inspection resources. The context may include system metrics, network traffic patterns, or user behavior features. The arms represent different inspection strategies or anomaly detection models, and the reward indicates whether a true anomaly was detected. This approach enables adaptive allocation of resources toward the most effective detection methods.

Other Applications

Additional applications include portfolio optimization in finance, A/B testing in website optimization, resource allocation in cloud computing, and educational technology for adaptive learning. The flexibility of the bandit framework makes it suitable for any scenario that requires sequential decision-making under uncertainty with limited feedback.

Bandit Algorithms and Methods

Stochastic Bandits

Stochastic bandits assume that the rewards of each arm are drawn independently from a fixed distribution. Key algorithms include ε-greedy, which selects the arm with the best estimated reward with probability 1-ε and a uniformly random arm with probability ε; Upper Confidence Bound (UCB) algorithms, which select arms according to optimistic estimates of their potential; and Thompson sampling, which uses Bayesian posterior distributions to balance exploration and exploitation.
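The first two selection rules can be sketched as follows. This is an illustrative implementation under stated assumptions, not a reference one: rewards are Bernoulli, the UCB variant is the common UCB1 rule with bonus sqrt(2 ln t / n), and the `run` harness, arm means, and ε = 0.1 are hypothetical.

```python
import math
import random

def epsilon_greedy(counts, values, epsilon, rng):
    """With probability epsilon pick a uniformly random arm, else the best empirical mean."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    return max(range(len(counts)), key=lambda a: values[a])

def ucb1(counts, values, t):
    """Pick the arm with the largest optimistic estimate: mean + sqrt(2 ln t / n)."""
    for a, n in enumerate(counts):
        if n == 0:
            return a  # play every arm once before using the bonus
    return max(
        range(len(counts)),
        key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]),
    )

def run(policy, means, horizon, seed=0):
    """Simulate `policy` on Bernoulli arms and return how often each arm was pulled."""
    rng = random.Random(seed)
    counts = [0] * len(means)
    values = [0.0] * len(means)   # running empirical mean reward per arm
    for t in range(1, horizon + 1):
        arm = policy(counts, values, t, rng)
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    return counts

means = [0.2, 0.5, 0.8]  # illustrative true means
eg_counts = run(lambda c, v, t, rng: epsilon_greedy(c, v, 0.1, rng), means, 5000)
ucb_counts = run(lambda c, v, t, rng: ucb1(c, v, t), means, 5000)
print("epsilon-greedy pulls:", eg_counts)
print("UCB1 pulls:", ucb_counts)
```

In this sketch ε-greedy keeps exploring at a constant rate for the whole run, whereas the UCB1 bonus shrinks as an arm is pulled more often, so exploration of clearly inferior arms tapers off.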

Adversarial Bandits

Adversarial bandits make no statistical assumptions about how rewards are generated, treating them as arbitrary sequences that may be chosen by an adversary. The Exp3 algorithm and its variants are designed for this setting, using an exponential weighting scheme to achieve sublinear regret against any reward sequence.
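The exponential weighting scheme can be sketched as below. This follows the commonly presented form of Exp3 (mixing the weights with uniform exploration at rate γ and updating with importance-weighted reward estimates); the reward sequence, γ = 0.1, and the weight renormalisation step are assumptions of this example rather than anything specified in the text.

```python
import math
import random

def exp3_probabilities(weights, gamma):
    """Mix the normalised weights with uniform exploration at rate gamma."""
    total = sum(weights)
    k = len(weights)
    return [(1 - gamma) * w / total + gamma / k for w in weights]

def exp3(reward_fn, k, horizon, gamma=0.1, seed=0):
    """Run Exp3 against reward_fn(t, arm), which must return a value in [0, 1]."""
    rng = random.Random(seed)
    weights = [1.0] * k
    total_reward = 0.0
    for t in range(horizon):
        probs = exp3_probabilities(weights, gamma)
        arm = rng.choices(range(k), weights=probs)[0]
        reward = reward_fn(t, arm)
        total_reward += reward
        # Importance-weighted estimate: unbiased for the chosen arm's reward.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / k)
        # Rescale so the largest weight is 1; probabilities are unchanged,
        # and this avoids floating-point overflow on long runs.
        top = max(weights)
        weights = [w / top for w in weights]
    return exp3_probabilities(weights, gamma), total_reward

def fixed_sequence(t, arm):
    """Hypothetical deterministic rewards: arm 1 always pays, arm 0 pays on even rounds."""
    if arm == 1:
        return 1.0
    if arm == 0:
        return 1.0 if t % 2 == 0 else 0.0
    return 0.0

final_probs, gained = exp3(fixed_sequence, k=3, horizon=3000)
print(final_probs, gained)
```

Because every arm keeps a probability of at least γ/K, the algorithm never stops sampling any arm, which is what lets it react if the reward sequence changes.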

Bayesian Bandits

Bayesian bandits maintain a probability distribution over the possible reward distributions of the arms. Thompson sampling is the most popular Bayesian approach: it draws a sample from the posterior distribution of each arm's reward parameters and selects the arm with the highest sampled value. This naturally balances exploration and exploitation according to the current level of uncertainty.
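For 0/1 rewards, Thompson sampling has a particularly compact form. The sketch below assumes Bernoulli rewards with an independent uniform Beta(1, 1) prior on each arm's success probability; the prior, the arm means, and the function name are illustrative assumptions.

```python
import random

def thompson_sampling(means, horizon, seed=0):
    """Beta-Bernoulli Thompson sampling; returns the number of pulls per arm."""
    rng = random.Random(seed)
    k = len(means)
    alpha = [1] * k   # 1 + observed successes (Beta(1, 1) prior)
    beta = [1] * k    # 1 + observed failures
    pulls = [0] * k
    for _ in range(horizon):
        # Draw one plausible success probability per arm from its posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        arm = max(range(k), key=lambda a: samples[a])
        reward = 1 if rng.random() < means[arm] else 0
        # Conjugate update: the posterior stays a Beta distribution.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

ts_pulls = thompson_sampling([0.2, 0.5, 0.8], horizon=3000)
print(ts_pulls)
```

Arms with few observations have wide posteriors and so occasionally produce the largest sample, which is where the exploration comes from; as evidence accumulates the posteriors narrow and the best arm dominates.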

Contextual Bandit Algorithms

Contextual bandit algorithms extend these approaches to incorporate context information. LinUCB assumes linear reward functions and maintains confidence ellipsoids around the parameter estimates. Neural bandits use deep neural networks to model complex relationships between context and reward. These algorithms have shown strong performance in large-scale, high-dimensional applications.
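A minimal sketch of the LinUCB idea follows, assuming the "disjoint" variant with one ridge-regression model per arm and an exploration weight α = 1. To stay dependency-free it keeps the inverse design matrix directly and updates it with the Sherman-Morrison formula; the class name, the two-context toy environment, and its payoff table are hypothetical.

```python
import math
import random

class LinUCBArm:
    """Ridge-regression state for one arm (disjoint linear model, assumed here)."""

    def __init__(self, dim):
        # Inverse of A = I + sum of x x^T; starts as the identity matrix.
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim  # sum of reward * x

    def _A_inv_times(self, x):
        return [sum(row[j] * x[j] for j in range(len(x))) for row in self.A_inv]

    def ucb(self, x, alpha):
        theta = self._A_inv_times(self.b)  # ridge estimate A^{-1} b
        mean = sum(t * xi for t, xi in zip(theta, x))
        v = self._A_inv_times(x)
        # Width of the confidence ellipsoid along x: sqrt(x^T A^{-1} x).
        width = math.sqrt(max(sum(vi * xi for vi, xi in zip(v, x)), 0.0))
        return mean + alpha * width

    def update(self, x, reward):
        # Sherman-Morrison: (A + x x^T)^{-1} = A^{-1} - v v^T / (1 + x^T v), v = A^{-1} x.
        v = self._A_inv_times(x)
        denom = 1.0 + sum(vi * xi for vi, xi in zip(v, x))
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= v[i] * v[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]

def choose(arms, x, alpha):
    """Select the arm with the highest upper confidence bound for context x."""
    return max(range(len(arms)), key=lambda a: arms[a].ucb(x, alpha))

def simulate(horizon=2000, alpha=1.0, seed=0):
    """Toy environment: the best arm equals the index of the one-hot context."""
    rng = random.Random(seed)
    arms = [LinUCBArm(2), LinUCBArm(2)]
    payoff = [[0.9, 0.1], [0.1, 0.9]]  # payoff[context][arm], illustrative values
    correct_late = 0
    for t in range(horizon):
        c = rng.randrange(2)
        x = [1.0, 0.0] if c == 0 else [0.0, 1.0]
        arm = choose(arms, x, alpha)
        reward = 1.0 if rng.random() < payoff[c][arm] else 0.0
        arms[arm].update(x, reward)
        if t >= horizon - 500 and arm == c:
            correct_late += 1
    return correct_late / 500  # share of best-arm choices in the last 500 rounds

late_accuracy = simulate()
print(f"best arm chosen in {late_accuracy:.0%} of the last 500 rounds")
```

Maintaining the inverse incrementally avoids a full matrix inversion every round; a production implementation would more likely use a numerical library and a tuned α.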

Conclusion

Multi-armed bandits provide a principled framework for sequential decision-making under uncertainty with limited feedback. The fundamental exploration-exploitation trade-off appears in many real-world applications, from clinical trials to recommender systems. The contextual bandit extension has proven especially important for personalized systems that adapt to individual characteristics.

This survey has given an overview of the major developments in multi-armed bandits, with a focus on real-world applications. We examined the problem formulation, the main algorithms, and a range of application domains. The field continues to evolve rapidly, with new algorithms addressing challenges such as non-stationarity, large action spaces, and safety constraints.

As bandit algorithms become more sophisticated and are applied to more complex problems, they will continue to play an important role in improving decision-making across many fields. Ongoing research in this area promises more efficient algorithms and broader applications in the future.