1. Introduction

Accurately predicting what a student knows is a cornerstone of effective personalized learning systems. This paper presents a novel ensemble model designed to predict word-level mistakes (knowledge gaps) made by second-language learners on the Duolingo platform. The model achieved the highest score on both evaluation metrics (AUC and F1-score) across all three language datasets (English, French, Spanish) in the 2018 Shared Task on Second Language Acquisition Modeling (SLAM). The work demonstrates the potential of combining sequential and feature-based models, while critically examining the gap between academic benchmark tasks and real-world production requirements for adaptive learning.

2. Data and Evaluation Setup

The analysis is based on student trace data from Duolingo, covering the first 30 days of user interactions for learners of English, French, and Spanish.

2.1. Dataset Overview

The data consists of user responses aligned to a set of correct answers using a finite-state transducer method. The dataset is pre-split into training, development, and test sets, with the split made chronologically per user (the last 10% of each user's events for testing). Features include token-level information, part-of-speech tags, and exercise metadata; notably, the raw sentence the user entered is not provided.
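
To make the per-user chronological split concrete, here is a minimal sketch. The record layout `(user_id, timestamp, payload)` and the function name `chronological_split` are assumptions made for illustration; they are not the SLAM data format or the organizers' tooling.

```python
from collections import defaultdict


def chronological_split(events, test_frac=0.10):
    """Split events per user by time.

    The last `test_frac` of each user's events go to the test set and the
    rest go to training. `events` is an iterable of
    (user_id, timestamp, payload) tuples (an assumed, illustrative layout).
    """
    by_user = defaultdict(list)
    for event in events:
        by_user[event[0]].append(event)

    train, test = [], []
    for user_events in by_user.values():
        user_events.sort(key=lambda e: e[1])  # oldest first
        # Hold out at least one event per user.
        n_test = max(1, round(len(user_events) * test_frac))
        cut = len(user_events) - n_test
        train.extend(user_events[:cut])
        test.extend(user_events[cut:])
    return train, test
```

A development set could be carved out the same way by applying the function again to the training portion.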

2.2. Task and Metrics

The core task is binary classification: predict whether a particular word (token) in the learner's response will be incorrect. Model performance is evaluated using the area under the ROC curve (AUC) and the F1 score, with predictions submitted through an evaluation server.
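
The two metrics can be sketched in a few lines of plain Python. This is an illustrative implementation, not the evaluation server's code; in particular, the 0.5 decision threshold used for F1 is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive example is scored
    above a random negative one (ties count as half a win).
    Quadratic in the number of examples, which is fine for a sketch."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(labels, scores, threshold=0.5):
    """F1 of the positive (mistake) class at an assumed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


labels = [0, 0, 1, 1]            # 1 = the token was a mistake
scores = [0.1, 0.4, 0.35, 0.8]   # predicted mistake probabilities
print(roc_auc(labels, scores))   # 0.75
```

AUC depends only on how the tokens are ranked, whereas F1 also depends on where the threshold is placed, so a model can do well on one and poorly on the other.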

2.3. Limitations for Production Use

The authors identify three critical limitations of the SLAM task setup for real-time personalization:

  1. Data Leakage: Prediction requires the "best-matching correct sentence," which is not known in advance for open-ended questions.
  2. Temporal Data Leakage: Some of the provided features contain future information.
  3. No Cold-Start Scenario: The evaluation includes no genuinely new users, because every user appears in the training data.
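
The third limitation can be checked, and worked around, with a user-level split. The sketch below is one possible approach and is not taken from the paper: `unseen_user_fraction` measures how much of a test set comes from users absent from training, and `cold_start_split` holds out whole users. Both names and the `(user_id, timestamp, payload)` layout are hypothetical.

```python
import random


def unseen_user_fraction(train, test):
    """Fraction of test events whose user never appears in training.
    A per-user chronological split yields 0.0 when every user also has
    training events, i.e. no cold-start coverage."""
    train_users = {e[0] for e in train}
    unseen = sum(1 for e in test if e[0] not in train_users)
    return unseen / len(test)


def cold_start_split(events, holdout_frac=0.2, seed=0):
    """Hold out entire users so the test set contains only new users.
    `events` is a list of (user_id, timestamp, payload) tuples."""
    users = sorted({e[0] for e in events})
    rng = random.Random(seed)  # fixed seed for a reproducible split
    rng.shuffle(users)
    n_holdout = max(1, round(len(users) * holdout_frac))
    holdout = set(users[:n_holdout])
    train = [e for e in events if e[0] not in holdout]
    test = [e for e in events if e[0] in holdout]
    return train, test
```

Reporting metrics on both kinds of split would show how much of a model's score depends on having already seen each learner.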

This highlights the substantial gap between academic competitions and deployable EdTech solutions.

3. Method

The proposed solution is an ensemble that exploits the complementary strengths of two distinct model families.

3.1. Ensemble Architecture

The final prediction is produced by combining the outputs of a gradient boosted decision tree (GBDT) model and a recurrent neural network (RNN) model. The GBDT excels at learning complex interactions from structured features, while the RNN captures temporal dependencies in the student's learning sequence.

3.2. Model Components

  • Gradient Boosted Decision Trees (GBDT): Used for its robustness and its ability to handle mixed data types and the non-linear relationships present in the feature set (e.g., exercise difficulty, time since last review).
  • Recurrent Neural Network (RNN): Specifically, a model inspired by Deep Knowledge Tracing (DKT), designed to model the sequential evolution of a student's knowledge state over time, capturing patterns of forgetting and learning.

3.3. Technical Details & Mathematical Formulation

The ensemble's predictive power comes from combining probabilities. If $P_{GBDT}(y=1|x)$ is the GBDT's predicted probability of a mistake, and $P_{RNN}(y=1|s)$ is the RNN's probability given the sequence $s$, a simple yet effective combination is the weighted average:

$P_{ensemble} = \alpha \cdot P_{GBDT} + (1 - \alpha) \cdot P_{RNN}$

where $\alpha$ is a hyperparameter tuned on the development set. The RNN typically uses a Long Short-Term Memory (LSTM) cell to update a hidden knowledge state $h_t$ at time step $t$:

$h_t = \text{LSTM}(x_t, h_{t-1})$

where $x_t$ is the feature vector for the current exercise. The prediction is then made through a fully connected layer: $P_{RNN} = \sigma(W \cdot h_t + b)$, where $\sigma$ is the sigmoid function.
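
The weighted average and the tuning of $\alpha$ on the development set can be sketched as follows. This is an illustration under stated assumptions, not the authors' confirmed procedure: the grid search, the use of AUC as the selection criterion, and the names `blend` and `tune_alpha` are all hypothetical.

```python
def blend(p_gbdt, p_rnn, alpha):
    """P_ensemble = alpha * P_GBDT + (1 - alpha) * P_RNN, element-wise."""
    return [alpha * g + (1 - alpha) * r for g, r in zip(p_gbdt, p_rnn)]


def roc_auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly,
    with ties counted as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def tune_alpha(dev_labels, p_gbdt, p_rnn, steps=20):
    """Grid-search alpha in [0, 1] and keep the value with the best
    development-set AUC (first best value wins on ties)."""
    best_alpha, best_auc = 0.0, -1.0
    for i in range(steps + 1):
        alpha = i / steps
        auc = roc_auc(dev_labels, blend(p_gbdt, p_rnn, alpha))
        if auc > best_auc:
            best_alpha, best_auc = alpha, auc
    return best_alpha, best_auc


# Toy development set: each model alone ranks one pair wrongly,
# but their errors differ, so a blend can rank every pair correctly.
dev_labels = [1, 0, 1, 0]
p_gbdt = [0.9, 0.2, 0.4, 0.5]
p_rnn = [0.6, 0.7, 0.8, 0.1]
alpha, auc = tune_alpha(dev_labels, p_gbdt, p_rnn)
```

In this toy data each model scores an AUC of 0.75 alone, while a blend with an intermediate $\alpha$ reaches 1.0, which is the kind of complementarity the ensemble relies on.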

4. Results & Discussion

4.1. Performance on SLAM 2018

The ensemble model achieved the highest score on both AUC and F1-score for all three language datasets in the competition, demonstrating its effectiveness. The authors note that although performance was strong, errors often occurred in linguistically complex contexts or with rare tokens, pointing to room for improvement through better feature engineering or the incorporation of linguistic priors.

4.2. Chart & Results Description

Prediction Performance Chart (Based on the Paper's Description): A bar chart would show AUC scores for the proposed Ensemble model, a standalone GBDT, and a standalone RNN (or a DKT baseline) across the English, French, and Spanish test sets. The Ensemble bars would be the tallest for each language. A second grouped bar chart would show the same for F1 scores. The visual would make the "ensemble advantage" clear: the combined model outperforms each individual component, supporting the synergy of the hybrid approach.

5. Analysis Framework & Case Example

A Framework for Evaluating EdTech Prediction Models:

  1. Task Fidelity: Does the prediction task match the actual decision point in the product? (SLAM task: low fidelity because of data leakage.)
  2. Model Composability: Can the model's output be integrated easily into a recommendation engine? (The ensemble score can serve as a direct signal for item selection.)
  3. Latency & Scale: Can it predict quickly enough for millions of users? (GBDT is fast and the RNN can be optimized; the ensemble may add overhead.)
  4. Explainability Gap: Can teachers or students understand why a prediction was made? (GBDT offers some feature importances; the RNN is a black box.)

Case Example (No Code): Consider a student, "Alex," who struggles with French past-tense verbs. The GBDT component may detect that Alex consistently fails on exercises tagged "past_tense" and "irregular_verb." The RNN component detects that mistakes cluster in sessions following a 3-day break, indicating forgetting. The ensemble combines these signals and predicts a high probability of a mistake on the next irregular past-tense exercise. A personalized system could then intervene with a targeted review or a hint before presenting that exercise.

6. Industry Analyst's Perspective

A critical, opinionated breakdown of the paper's implications for the EdTech sector.

6.1. Core Insight

The paper's real value is not just another winning competition model; it is a tacit admission that the field is stuck in a local optimum. We have become brilliant at building models that win benchmarks like SLAM but are often naive about the operational realities of deploying them. The ensemble technique (GBDT+RNN) is smart but unsurprising, akin to carrying both a knife and a hammer in the toolbox. The more provocative insight is buried in the discussion: academic leaderboards are becoming poor proxies for product-ready AI. The paper implicitly argues that we need evaluation frameworks that penalize data leakage and prioritize cold-start performance, a position that should be shouted, not whispered.

6.2. Logical Flow

The argument flows from a strong premise: detecting knowledge gaps matters. It then presents a technically sound solution (the ensemble) that wins the benchmark. However, the logic takes a crucial turn by deconstructing the very benchmark it won. This reflexive critique is the paper's strongest asset. It follows the pattern: "Here is what works in the lab. Now, let's talk about why the lab setup is fundamentally flawed for the factory floor." This move from construction to critique is what separates a useful research contribution from a mere contest entry.

6.3. Strengths & Flaws

Strengths:

  • Pragmatic Ensemble Design: Combining a static-feature workhorse (GBDT) with a temporal model (RNN) is a proven, low-risk route to performance gains. It avoids the over-engineering trap.
  • Production-Aware Critique: The discussion of the task's limitations is extremely valuable for product managers and ML engineers. It is a reality check the industry badly needs.

Flaws & Missed Opportunities:

  • Shallow on the "How": The paper is light on the specifics of how the models are combined (simple averaging? learned weights? stacking?). This is the crucial engineering detail.
  • Neglects Model Explainability: In a domain that affects learning, the "why" behind a prediction is essential for building trust with learners and educators. The black-box nature of the ensemble, especially the RNN, is a major deployment hurdle left unaddressed.
  • No Alternative Evaluation: While critiquing the SLAM setup, the paper does not propose or test a revised, more production-realistic evaluation. It points at the problem but does not begin digging the foundation of a solution.

6.4. Actionable Insights

For EdTech companies and researchers:

  1. Demand Better Benchmarks: Stop treating competition wins as the primary validation. Advocate for and contribute to new benchmarks that simulate real-world constraints: no future data, strict user-level temporal splits, and cold-start tracks.
  2. Adopt Hybrid Architectures: The GBDT+RNN blueprint is a safe bet for teams building knowledge tracing systems. Start there before chasing more exotic, monolithic architectures.
  3. Invest in "MLOps for EdTech": The gap is not only in model architecture; it is in the pipeline. Build evaluation frameworks that continuously test for data drift, concept drift (as curricula change), and fairness across learner subgroups.
  4. Prioritize Explainability from Day One: Do not treat it as an afterthought. Explore techniques such as SHAP for GBDTs or attention mechanisms for RNNs to provide actionable feedback (for example, "You are struggling here because you have not practiced this rule in 5 days").
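
As one concrete reading of the data-drift check in item 3, the sketch below computes a population stability index (PSI) between a reference sample of a feature and a recent sample. PSI is a technique swapped in here for illustration and is not mentioned in the paper; the function name, the equal-width binning, and the smoothing constant `eps` are assumptions.

```python
import math


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI = sum over bins of (a - e) * ln(a / e), where e and a are the
    bin proportions of the reference and recent samples. Bins are
    equal-width over the reference range; empty bins are floored at `eps`
    so the logarithm stays defined. Both samples must be non-empty."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clip out-of-range values
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical feature, e.g. a normalized "days since last practice".
reference = [i / 100 for i in range(100)]
recent = [v + 0.3 for v in reference]  # the distribution has shifted
drift = population_stability_index(reference, recent)
```

A value of zero means the two samples fall into the bins in identical proportions; larger values indicate a stronger shift and could trigger re-evaluation or retraining of the models.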

7. Future Applications & Directions

  • Beyond Binary Errors: Predicting the type of error (grammatical, lexical, syntactic) to enable more fine-grained feedback and remediation paths.
  • Cross-Lingual & Cross-Domain Transfer: Leveraging patterns learned from millions of English learners to bootstrap models for low-resource languages, or even for different subjects such as math or coding.
  • Integration with Cognitive Models: Incorporating principles from cognitive science, such as spaced repetition algorithms (like those used in Anki), directly into the model's objective function, moving from pure prediction to optimal scheduling.
  • Generative Feedback: Using the predicted error location and type as input to a large language model (LLM) to generate personalized, natural-language hints or explanations in real time, moving from detection to dialogue.
  • Affective State Modeling: The ensemble approach could be extended to combine performance predictors with engagement or frustration detectors (from clickstream or, where available, sensor data) to build a holistic model of the learner's state.

8. Original Analysis & Summary

This paper by Osika et al. represents a maturation point in the evolution of Educational Data Mining (EDM). It demonstrates technical competence with a winning ensemble model but, more importantly, shows growing self-awareness within the field about translating research into practice. The GBDT and RNN ensemble is a pragmatic choice, echoing the trend in other domains where hybrid models often outperform pure architectures. For instance, the success of model ensembles in winning Kaggle competitions is well documented, and their application here follows a reliable pattern. The paper's lasting contribution, however, is its critical examination of the Shared Task setup itself.

The authors correctly identify that data leakage and the absence of a true cold-start scenario make the SLAM leaderboard an imperfect indicator of production viability. This aligns with broader critiques in machine learning, such as those raised in the influential "CycleGAN" paper and subsequent discussions of reproducible research, which stress the importance of evaluation protocols that reflect real-world use. The paper implicitly argues for a shift from "accuracy-at-all-costs" benchmarking to deployment-aware evaluation, a shift championed in NLP by dynamic benchmarks such as Dynabench.

From a technical standpoint, the approach is sound but not revolutionary. The real innovation lies in the paper's dual narrative: it offers a recipe for a high-performing model while questioning the kitchen in which it was cooked. For the EdTech industry, the takeaway is clear: investing in robust hybrid prediction models is necessary but not sufficient. It must be matched by investment in evaluation frameworks, data pipelines, and explainability tools that bridge the gap between the lab and the learner's screen. The future of personalized learning depends not only on predicting mistakes accurately, but on building AI systems that are trustworthy, scalable, and pedagogically integrated, a challenge that goes well beyond optimizing AUC.

9. References

  1. Osika, A., Nilsson, S., Sydorchuk, A., Sahin, F., & Huss, A. (2018). Second Language Acquisition Modeling: An Ensemble Approach. arXiv preprint arXiv:1806.04525.
  2. Settles, B., Brunk, B., Gustafson, L., & Hagiwara, M. (2018). Second Language Acquisition Modeling. Proceedings of the NAACL-HLT 2018 Workshop on Innovative Use of NLP for Building Educational Applications.
  3. Piech, C., Bassen, J., Huang, J., Ganguli, S., Sahami, M., Guibas, L. J., & Sohl-Dickstein, J. (2015). Deep knowledge tracing. Advances in Neural Information Processing Systems, 28.
  4. Lord, F. M. (1952). A theory of test scores. Psychometric Monographs, No. 7.
  5. Bauman, K., & Tuzhilin, A. (2014). Recommending remedial learning materials to students by filling their knowledge gaps. MIS Quarterly.
  6. Zhu, J. Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. Proceedings of the IEEE International Conference on Computer Vision (the CycleGAN paper, referenced for methodological critique).
  7. Mohri, M. (1997). Finite-state transducers in language and speech processing. Computational Linguistics, 23(2), 269-311.