Multi-Task Learning for Second Language Acquisition Modeling in Low-Resource Scenarios

A research paper introducing a novel multi-task learning approach that improves second language acquisition modeling by exploiting common patterns across different language-learning datasets.

1. Introduction

Second Language Acquisition (SLA) modeling is a specialized form of Knowledge Tracing (KT) that predicts whether language learners will answer questions correctly, based on their learning history. It is a core component of personalized learning systems. Existing methods, however, struggle in low-resource scenarios because they lack sufficient training data. This paper addresses that gap with a novel multi-task learning approach that exploits common patterns across different language-learning datasets to improve prediction performance, especially when data is scarce.

2. Background & Related Work

SLA modeling is framed as a word-level binary classification task. Given an exercise (e.g., listening, translation), the model predicts whether the learner will answer each word correctly, based on the exercise metadata and the correct sentence. Traditional methods train a separate model on each language dataset, which leaves them vulnerable to data scarcity. Low-resource problems arise from small datasets (e.g., for less commonly studied languages such as Czech) and from the user cold-start scenario, when a learner begins a new language. Multi-Task Learning (MTL), which improves generalization by learning related tasks jointly, is a promising but under-explored solution in this domain.

3. Proposed Method

3.1 Problem Formulation

For a given language $L$, each learner is represented by a sequence of exercises. Each exercise consists of metadata, the correct sentence, and the learner's answer. The goal is to predict a binary correctness label for every word in the learner's answer.
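The formulation above can be pictured as a small data structure. This is only a sketch: the `Exercise` class, its field names, and the convention that label 1 means "answered correctly" are assumptions made for illustration, as is aligning the labels with the tokens of the correct sentence.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Exercise:
    """One exercise attempt by a learner (field names are illustrative)."""
    language: str                # target language L, e.g. "cs"
    metadata: Dict[str, str]     # exercise metadata, e.g. the exercise format
    reference_tokens: List[str]  # the correct sentence, tokenized
    labels: List[int]            # per-token label: 1 = correct, 0 = mistake

    def __post_init__(self) -> None:
        # Every token of the correct sentence needs exactly one binary label.
        if len(self.reference_tokens) != len(self.labels):
            raise ValueError("one label per token is required")
        if any(label not in (0, 1) for label in self.labels):
            raise ValueError("labels must be 0 or 1")


def token_examples(history: List[Exercise]) -> List[Tuple[str, int]]:
    """Flatten a learner's exercise sequence into (token, label) pairs."""
    return [(tok, lab)
            for ex in history
            for tok, lab in zip(ex.reference_tokens, ex.labels)]


# Hypothetical two-exercise history for one learner of Czech.
history = [
    Exercise("cs", {"format": "translate"}, ["dobrý", "den"], [1, 0]),
    Exercise("cs", {"format": "listen"}, ["ano"], [1]),
]
print(token_examples(history))
```

Each `(token, label)` pair is one training example for the word-level classifier; the order of the pairs preserves the learner's history.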

3.2 Multi-Task Learning Framework

The central hypothesis is that latent patterns in language learning (e.g., common types of grammatical errors, learning curves) are shared across languages. The proposed MTL framework trains on several language datasets jointly. Each language task has its own task-specific parameters, while a shared encoder learns a common representation of learner behavior and linguistic features.

3.3 Model Architecture

The model can use a shared neural network backbone (e.g., an LSTM or a Transformer encoder) to process input sequences from all languages. Task-specific output layers then make the predictions for each language. The loss function is a weighted sum of the losses from all tasks: $\mathcal{L} = \sum_{t=1}^{T} \lambda_t \mathcal{L}_t$, where $T$ is the number of language tasks, $\mathcal{L}_t$ is the loss of task $t$, and $\lambda_t$ is its balancing weight.
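A minimal sketch of this layout in plain Python, under strong simplifying assumptions: a single tanh layer stands in for the LSTM or Transformer backbone, each language head is one logistic unit, and the names `SharedEncoder`, `TaskHead`, and `total_loss` are illustrative rather than taken from the paper.

```python
import math
import random
from typing import Dict, List

Vector = List[float]


def linear(weights: List[Vector], bias: Vector, x: Vector) -> Vector:
    """Dense layer: returns W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class SharedEncoder:
    """Stand-in for the shared backbone: one tanh layer used by all tasks."""

    def __init__(self, in_dim: int, hid_dim: int, rng: random.Random):
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                  for _ in range(hid_dim)]
        self.b = [0.0] * hid_dim

    def __call__(self, x: Vector) -> Vector:
        return [math.tanh(h) for h in linear(self.w, self.b, x)]


class TaskHead:
    """Language-specific output layer: a logistic unit over the shared code."""

    def __init__(self, hid_dim: int, rng: random.Random):
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(hid_dim)]]
        self.b = [0.0]

    def __call__(self, z: Vector) -> float:
        logit = linear(self.w, self.b, z)[0]
        return 1.0 / (1.0 + math.exp(-logit))


def total_loss(task_losses: Dict[str, float],
               weights: Dict[str, float]) -> float:
    """Weighted sum over tasks: L = sum_t lambda_t * L_t."""
    return sum(weights[t] * loss for t, loss in task_losses.items())


rng = random.Random(0)
encoder = SharedEncoder(in_dim=4, hid_dim=3, rng=rng)
heads = {lang: TaskHead(hid_dim=3, rng=rng) for lang in ("en", "es", "cs")}

x = [1.0, 0.0, 0.5, -1.0]   # toy feature vector for one token
z = encoder(x)              # the same encoder serves every language
p_cs = heads["cs"](z)       # Czech-specific probability of a correct answer
print(p_cs, total_loss({"en": 0.4, "cs": 0.9}, {"en": 0.5, "cs": 2.0}))
```

Only the heads differ between languages, so gradients from every task would update the encoder's weights; that sharing is what lets a low-resource head benefit from the other languages.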

4. Experiments & Results

4.1 Datasets & Setup

The experiments use the public SLA datasets from the Duolingo Shared Task (NAACL 2018), covering languages such as English, Spanish, French, and Czech. The Czech dataset serves as the main low-resource scenario. Evaluation metrics include AUC-ROC and accuracy on the word-level classification task.
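For readers less familiar with the two metrics, the sketch below computes them from scratch on toy data. The pairwise formulation of AUC (the probability that a randomly chosen positive scores above a randomly chosen negative) is standard; the function names and the 0.5 decision threshold are assumptions of this example, and the labels and scores are made up.

```python
from typing import List


def accuracy(labels: List[int], scores: List[float],
             threshold: float = 0.5) -> float:
    """Share of tokens whose thresholded score matches the label."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return hits / len(labels)


def auc_roc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a positive outranks a negative.

    Ties count as half a win. The pairwise loop is O(P * N), which is
    fine for a sketch but too slow for full datasets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy word-level labels and predicted probabilities.
labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.5]
print(auc_roc(labels, scores), accuracy(labels, scores))
```

AUC depends only on how the scores rank the tokens, so it is unaffected by the threshold, whereas accuracy changes with it.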

4.2 Baselines

The baselines are single-task models trained independently on each language (e.g., logistic regression and LSTM-based KT models such as DKT), which represent the standard approach.

4.3 Key Results

The proposed multi-task learning method outperforms all single-task baselines in low-resource settings (e.g., for Czech). Improvements, though smaller, are also observed in non-low-resource scenarios (e.g., English), which indicates the robustness of the method and the value of the transferred knowledge.

Performance Improvement (Illustrative)

Low-Resource (Czech): The MTL model achieves roughly 15% higher AUC than the single-task model.

High-Resource (English): The MTL model shows a slight improvement (roughly 2%).

4.4 Ablation Study

The ablation study confirms the importance of the shared representation layers. Removing the multi-task component (i.e., training only on the target low-resource data) causes a significant drop in performance, confirming that knowledge transfer is the main driver of the gains.

5. Analysis & Discussion

5.1 Key Insight

The paper's main contribution is not a new architecture but a strategic reframing: treating data scarcity not as a terminal flaw but as an opportunity for transfer learning. By framing different language-learning tasks as related problems, the authors sidestep the need for large language-specific datasets, a major obstacle to personalization in EdTech. This mirrors the paradigm shift seen in computer vision with models like ResNet, where pre-training on ImageNet became a common starting point. The hypothesis that "learning to learn" patterns (e.g., common error types such as subject-verb agreement or phonetic confusion) are a skill that transfers across languages is powerful and underused.

5.2 Logical Flow

The argument is logical and well structured: (1) Identify a real pain point (the failure of SLA modeling in low-resource settings). (2) Propose a sensible solution (MTL for cross-lingual knowledge transfer). (3) Validate with empirical evidence (stronger results on the Czech and English datasets). (4) Explain the mechanism (the shared encoder learns common patterns). The path from problem to hypothesis to validation is clear. The reasoning stumbles somewhat, however, by not specifying what the "common latent patterns" consist of. Are they grammatical, phonetic, or related to learner psychology? The paper would be stronger with a qualitative analysis of what the shared encoder actually learned, such as the attention visualizations common in NLP research.

5.3 Strengths & Weaknesses

Strengths: The paper tackles a real, commercially relevant problem in EdTech. The MTL approach is elegant and computationally efficient compared with generating synthetic data. The results are convincing, especially for the low-resource scenario. The link to the well-known Duolingo shared task provides a credible benchmark.

Weaknesses: The inner workings of the model remain something of a black box. There is little discussion of negative transfer: what happens when tasks are too dissimilar and hurt performance? The choice of language pairs for MTL appears arbitrary; a systematic analysis of language-family proximity (e.g., Spanish-Italian versus English-Japanese) and its effect on transfer would be valuable. In addition, the reliance on the 2018 Duolingo data makes the work somewhat dated; the field has moved quickly.

5.4 Actionable Insights

For product teams behind language-learning apps (Duolingo, Babbel, Memrise), this research is a blueprint for improving the early user experience and supporting niche languages. The immediate action is to build an MTL pipeline that trains continuously on all user data across languages, using high-resource languages to bootstrap models for new, low-resource ones. For researchers, the next step is to explore more advanced MTL techniques such as task-aware networks or meta-learning (e.g., MAML) for few-shot adaptation. The key business insight: this approach turns a company's entire user base, across all languages, into a data asset for improving each vertical product, increasing data utilization.

6. Technical Details

The technical core consists of a shared encoder $E$ with parameters $\theta_s$ and task-specific heads $H_t$ with parameters $\theta_t$ for each language task $t$. The input for an exercise in language $t$ is a feature vector $x_t$. The shared representation is $z = E(x_t; \theta_s)$. The task-specific prediction is $\hat{y}_t = H_t(z; \theta_t)$. The model is trained to minimize the joint loss: $\min_{\theta_s, \theta_1, \ldots, \theta_T} \sum_{t=1}^{T} \frac{N_t}{N} \sum_{i=1}^{N_t} \mathcal{L}(\hat{y}_t^{(i)}, y_t^{(i)})$, where $N_t$ is the number of samples for task $t$, $N$ is the total number of samples, and $\mathcal{L}$ is the binary cross-entropy loss. This corresponds to the weighted sum of Section 3.3 with $\lambda_t = N_t / N$ applied to each task's summed loss; the weighting controls how much tasks of different sizes contribute to the updates of the shared encoder.
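The joint objective can be evaluated directly on toy predictions. The sketch below assumes the predictions $\hat{y}_t^{(i)}$ have already been produced by the encoder and heads; the function names `bce` and `joint_loss`, the clipping constant, and the example numbers are all illustrative.

```python
import math
from typing import Dict, List, Tuple

Batch = List[Tuple[float, int]]  # (predicted probability, true label) pairs


def bce(p: float, y: int, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one prediction, clipped for stability."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def joint_loss(tasks: Dict[str, Batch]) -> float:
    """Compute sum_t (N_t / N) * sum_i BCE(y_hat_t^(i), y_t^(i))."""
    n_total = sum(len(batch) for batch in tasks.values())
    loss = 0.0
    for batch in tasks.values():
        task_sum = sum(bce(p, y) for p, y in batch)   # inner sum over i
        loss += (len(batch) / n_total) * task_sum     # weight N_t / N
    return loss


# Made-up predictions: three English samples, one Czech sample.
tasks = {
    "en": [(0.9, 1), (0.2, 0), (0.7, 1)],   # N_en = 3
    "cs": [(0.6, 1)],                        # N_cs = 1
}
print(joint_loss(tasks))
```

With these toy sizes the English task receives weight 0.75 and the Czech task 0.25, which makes the dependence of each task's influence on its sample count easy to inspect.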

7. Analysis Framework Example

Scenario: A new language-learning platform wants to launch courses in Swedish (low-resource) and German (high-resource).
Applying the Framework:

  1. Task Definition: Define SLA modeling as the core prediction task for both languages.
  2. Architecture Setup: Implement a shared BiLSTM or Transformer encoder. Create two task-specific output layers (one for Swedish, one for German).
  3. Training Protocol: Train the model jointly on logged user-interaction data from both the German and the Swedish courses from day one. Use a dynamic loss-weighting strategy that initially gives more weight to the German data in order to stabilize the shared encoder.
  4. Evaluation: Continuously monitor the Swedish model's performance (AUC) against a baseline model trained only on Swedish data. The key metric is how the performance gap between the two evolves over time.
  5. Iteration: As the Swedish user data grows, gradually adjust the loss weights. Analyze the attention weights of the shared encoder to identify which German learning patterns most influence the Swedish predictions (e.g., compound noun structure).
This framework provides a structured, data-driven way to use existing resources when entering a new market.
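Steps 3 to 5 leave the weighting schedule open. One possible choice, shown purely as an assumption, is a linear schedule that starts with most of the weight on German and moves toward equal weights as Swedish data accumulates. The function names, the 0.8 and 0.5 endpoints, and the `n_parity` parameter are all hypothetical.

```python
from typing import Dict


def loss_weights(n_swedish: int, n_parity: int,
                 start_german: float = 0.8,
                 end_german: float = 0.5) -> Dict[str, float]:
    """Loss weights for the German ("de") and Swedish ("sv") tasks.

    The German weight starts at `start_german` and moves linearly to
    `end_german` once `n_parity` Swedish samples have been collected.
    The two weights always sum to 1.
    """
    if n_swedish < 0 or n_parity <= 0:
        raise ValueError("n_swedish must be >= 0 and n_parity > 0")
    progress = min(n_swedish / n_parity, 1.0)   # 0 at launch, 1 at parity
    w_german = start_german + (end_german - start_german) * progress
    return {"de": w_german, "sv": 1.0 - w_german}


def auc_gap(auc_mtl: float, auc_swedish_only: float) -> float:
    """Step 4 metric: Swedish AUC of the MTL model minus the baseline."""
    return auc_mtl - auc_swedish_only


# Weights at launch, midway, and after the parity point (hypothetical counts).
for n in (0, 5_000, 20_000):
    print(n, loss_weights(n, n_parity=10_000))
```

A linear schedule is only the simplest option; the weights could instead be tuned against the monitored AUC gap from step 4.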

8. Future Applications & Directions

Applications:

  • Cross-Platform Personalization: Extending MTL to transfer patterns not only across languages but also across different educational domains (e.g., from mathematics to coding logic).
  • Early Intervention Systems: Using the improved low-resource predictions to flag at-risk learners early, even in new courses with little historical data.
  • Content Generation: Informing the automatic generation of tailored exercises for low-resource languages, based on patterns validated in successful ones.
Research Directions:
  • Meta-Learning for SLA: Exploring Model-Agnostic Meta-Learning (MAML) to build models that can adapt to a new language from only a few examples.
  • Explainable Transfer: Developing methods to interpret and visualize exactly what is being transferred, increasing trust in the model.
  • Multimodal MTL: Incorporating multimodal data (speech, typing timing) into the shared representation to capture richer learning patterns.
  • Federated MTL: Implementing the framework in a privacy-preserving way using federated learning, enabling knowledge transfer without centralizing sensitive user data.
The convergence of MTL with large language models (LLMs) pre-trained on multilingual text presents a major opportunity. Fine-tuning models such as mBERT or XLM-R on multilingual SLA data could yield strong and sample-efficient predictors.

9. References

  1. Corbett, A. T., & Anderson, J. R. (1994). Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction, 4(4), 253-278.
  2. Piech, C., Bassen, J., Huang, J., Ganguli, S., Sahami, M., Guibas, L. J., & Sohl-Dickstein, J. (2015). Deep knowledge tracing. Advances in Neural Information Processing Systems, 28.
  3. Settles, B., & Meeder, B. (2016). A trainable spaced repetition model for language learning. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
  4. Ruder, S. (2017). An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098.
  5. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
  6. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
  7. Finn, C., Abbeel, P., & Levine, S. (2017). Model-agnostic meta-learning for fast adaptation of deep networks. International Conference on Machine Learning (pp. 1126-1135). PMLR.