CPG-EVAL: A Multi-Tiered Benchmark for Evaluating the Chinese Pedagogical Grammar Competence of Large Language Models

Introduces CPG-EVAL, the first benchmark to systematically evaluate the Chinese pedagogical grammar knowledge of large language models (LLMs), assessing recognition, discrimination, and resistance to interference.

1. Introduction

The paper opens with a striking analogy: deploying large language models (LLMs) such as ChatGPT as teachers without properly vetting them is like letting unqualified teachers instruct students. This points to a significant gap. Although LLMs show promise in foreign-language education (e.g., content generation, error correction), their core pedagogical grammar competence, meaning the ability to understand and explain grammar rules in a teachable, context-aware way, has not yet been properly measured. The authors argue that existing NLP benchmarks are inadequate for this specialized task. They therefore introduce CPG-EVAL (Chinese Pedagogical Grammar Evaluation), the first dedicated, multi-tiered benchmark designed to systematically assess the pedagogical grammar knowledge of LLMs in the context of Teaching Chinese as a Foreign Language (TCFL).

2. Related Work

The paper situates CPG-EVAL within two strands of research. First, it reviews the growing use of LLMs in language education, covering areas such as automated writing evaluation, speaking practice, and resource development (e.g., Bin-Hady et al., 2023; Kohnke et al., 2023). Second, it discusses the evolution of AI benchmarks, from general-purpose suites (e.g., GLUE, SuperGLUE) to more specialized evaluations. The authors note the lack of benchmarks grounded in pedagogical theory and language-teaching expertise, a gap CPG-EVAL aims to fill by bridging computational linguistics and applied linguistics for TCFL.

3. The CPG-EVAL Benchmark

3.1. Theoretical Foundation & Design Principles

CPG-EVAL is grounded in a pedagogical grammar classification framework validated through extensive TCFL practice. It is guided by principles of pedagogical alignment, which ensure that tasks reflect real teaching scenarios. The benchmark assesses not only grammatical correctness but also the model's ability to perform tasks relevant to a teacher or tutor, such as identifying errors, explaining rules, and selecting appropriate teaching examples.

3.2. Task Taxonomy & Evaluation Structure

The benchmark comprises five core tasks that together form a multi-tiered evaluation structure:

  1. Grammar Recognition: Judge whether a given sentence uses a target grammar point correctly.
  2. Fine-Grained Discrimination: Distinguish between grammatical constructions, or uses of them, that closely resemble one another.
  3. Categorical Discrimination: Classify grammatical errors or sentences into specific pedagogical categories (e.g., misuse of "了", word-order error).
  4. Linguistic Interference Resistance (Single Instance): Assess the model's ability to handle one confusing or misleading example.
  5. Linguistic Interference Resistance (Multiple Instances): A more challenging variant in which the model must reason over several potentially confusing examples.

The structure is designed to probe different depths of pedagogical understanding, from basic recognition to deeper reasoning under interference.
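
The five-task taxonomy above can be sketched as a small data model. This is an illustrative schema only: the names `Task`, `Item`, and `is_interference` are hypothetical and do not come from the paper, which does not publish its data format in this summary.

```python
from dataclasses import dataclass
from enum import Enum


class Task(Enum):
    """The five CPG-EVAL task types, numbered as in the list above."""
    RECOGNITION = 1
    FINE_GRAINED_DISCRIMINATION = 2
    CATEGORICAL_DISCRIMINATION = 3
    INTERFERENCE_SINGLE = 4
    INTERFERENCE_MULTIPLE = 5


@dataclass(frozen=True)
class Item:
    """One benchmark question (hypothetical schema, not the paper's format)."""
    task: Task
    sentences: tuple[str, ...]  # one sentence, or several for multi-instance tasks
    grammar_point: str          # the pedagogical grammar point being probed
    label: str                  # gold answer


def is_interference(task: Task) -> bool:
    """True for the two tasks that test resistance to linguistic interference."""
    return task in (Task.INTERFERENCE_SINGLE, Task.INTERFERENCE_MULTIPLE)


item = Item(Task.RECOGNITION, ("我把书放在桌子上。",), "把", "yes")
print(item.task.name, is_interference(item.task))
```

Keeping the task type on every item makes it straightforward to report results per tier rather than as a single aggregate score.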

4. Experimental Setup & Results

4.1. Models & Evaluation Protocol

The study evaluated a range of LLMs, including smaller models (e.g., models with fewer than 10B parameters) and larger models (e.g., GPT-4, Claude 3). Evaluation was conducted in zero-shot or few-shot settings to gauge inherent capability. Performance was measured primarily by accuracy on the defined tasks.

4.2. Key Findings & Performance Analysis

The results reveal a clear performance hierarchy:

  • Smaller models can achieve reasonable success on simple, single-instance tasks (such as basic Grammar Recognition), but their performance drops sharply on tasks involving multiple instances or strong linguistic interference. This suggests they lack robust, generalizable grammatical reasoning.
  • Larger models (e.g., GPT-4) show better resistance to interference and handle multi-instance tasks more effectively, indicating stronger reasoning and contextual understanding. Even so, their accuracy remains far from perfect, leaving substantial room for improvement.
  • Overall performance across all models indicates that current LLMs, regardless of size, do not yet reliably possess Chinese pedagogical grammar competence. The benchmark successfully exposes specific weaknesses, such as confusion between similar grammatical particles and failure to apply the same rule consistently across examples.

Chart Description (Conceptual): A grouped bar chart would show accuracy scores (0-100%) for 4-5 model families across the 5 CPG-EVAL tasks. A positive correlation between model size and performance would be visible, with the gap between larger and smaller models widening markedly on Task 4 and especially Task 5 (the interference tasks). All models would score lowest on Task 5.

Key Metric: Performance Gap

~40%

Accuracy difference between larger and smaller models on complex interference tasks.

Benchmark Scale

5 Tiers

Multi-tiered task design probing different levels of competence.

Core Limitation Exposed

Pedagogical Misalignment

LLMs lack the ability to give teachable, context-aware grammar explanations.

5. Core Insight & Analyst Perspective

Core Insight: CPG-EVAL is not just another accuracy test; it is a reality check on the AI EdTech hype. It shows concretely that the grammatical "intelligence" of even the most advanced LLMs is shallow and pedagogically misaligned. They pass as casual speakers but fail as systematic teachers.

Logical Flow: The paper moves skillfully from identifying a critical market need (vetting AI tutors), to deconstructing the problem (what is pedagogical competence?), and finally to building a precise, theory-guided solution. The five-task structure is its standout feature, creating a difficulty gradient that separates memorization from genuine understanding.

Strengths & Flaws: Its main strength is its pedagogical grounding. Unlike general-purpose benchmarks, it was built for the TCFL field and by the TCFL field. This echoes the philosophy behind benchmarks such as MMLU (Massive Multitask Language Understanding), which aggregates expert knowledge across many disciplines, but CPG-EVAL goes deeper into a single applied domain. A potential flaw is its current focus on evaluation rather than improvement. It diagnoses the disease skillfully but offers limited treatment. Future work must link performance on CPG-EVAL to specific fine-tuning or alignment techniques, much as RAG (Retrieval-Augmented Generation) was developed to address the hallucination problems that earlier benchmarks exposed.

Actionable Insights: For EdTech companies, this is a mandatory vetting tool: never deploy an LLM-based Chinese tutor without running CPG-EVAL. For model developers, the benchmark offers a concrete roadmap for "pedagogical alignment," a new frontier beyond Constitutional AI. The low scores on interference tasks suggest that training on curated, pedagogically structured datasets, similar to the synthetic data strategies used for DALL-E 3 or AlphaCode 2, is essential. For educators and policymakers, the study is strong evidence for standards and certification in AI-assisted education. The era of blind trust in AI tutors is over.

6. Technical Details & Mathematical Framework

Although the PDF excerpt does not go into detail on complex formulas, the evaluation logic can be formalized. The core metric is the accuracy of a model $M$ on task $T_i$ of a benchmark $B$ with $n$ instances:

\[ \text{Accuracy}(M, T_i) = \frac{1}{|D_{T_i}|} \sum_{x \in D_{T_i}} \mathbb{I}(\hat{y}_x = y_x) \]

where $D_{T_i}$ is the dataset for task $i$, $\hat{y}_x$ is the model's prediction for instance $x$, $y_x$ is the ground-truth label, and $\mathbb{I}$ is the indicator function.
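
As a minimal sketch of the accuracy metric defined above, the per-task score can be computed by counting exact label matches. The function name `accuracy_by_task` and the record layout are assumptions made for illustration; the labels below are made-up toy data, not results from the paper.

```python
from collections import defaultdict


def accuracy_by_task(records):
    """Compute accuracy per task from (task_id, predicted, gold) triples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for task_id, predicted, gold in records:
        total[task_id] += 1
        correct[task_id] += int(predicted == gold)  # indicator function
    return {task_id: correct[task_id] / total[task_id] for task_id in total}


# Toy predictions for Task 1 and Task 5 (invented for this example).
records = [
    (1, "yes", "yes"),
    (1, "no", "yes"),
    (5, "A", "A"),
    (5, "B", "A"),
    (5, "C", "A"),
    (5, "A", "A"),
]
scores = accuracy_by_task(records)
print(scores)
```

Reporting one score per task, rather than pooling all instances, preserves the tiered view that the benchmark is designed to give.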

The key innovation lies in the construction of $D_{T_i}$, particularly for the interference tasks. These are likely to include carefully controlled negative examples or adversarial distractors. For example, in a task testing the distinction between "$\text{了}$" (le) marking a completed action and marking a change of state, an interference pair could be: "他病了三天。" (He was sick for three days.) and "他病三天了。" (He has been sick for three days.). The small difference tests deep grammatical and semantic understanding.
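
One way to picture such an interference pair is as a minimal pair whose two sentences use the same characters in a different order. The sketch below encodes the "了" example and checks that property. The `MinimalPair` class and the `is_reordering` heuristic are hypothetical illustrations, not the paper's construction method, and the reading notes are the usual textbook glosses rather than labels from the benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinimalPair:
    """Two near-identical sentences with different readings (hypothetical schema)."""
    sentence_a: str
    reading_a: str
    sentence_b: str
    reading_b: str


def is_reordering(a: str, b: str) -> bool:
    """True if b contains exactly the same characters as a, in a different order."""
    return a != b and sorted(a) == sorted(b)


pair = MinimalPair(
    "他病了三天。", "illness lasted three days (completed)",
    "他病三天了。", "illness has lasted three days so far",
)
print(is_reordering(pair.sentence_a, pair.sentence_b))  # True
```

A check like this only confirms that the surface forms are close; deciding whether the two readings really differ still requires a human annotator.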

7. Analysis Framework: A Case Example

Scenario: Evaluating an LLM's understanding of the "$\text{把}$" (bǎ) construction, a classic challenge in TCFL.

Applying the CPG-EVAL Tasks:

  1. Recognition (Task 1): Present: "我把书放在桌子上。" (I put the book on the table.) The model must judge it correct.
  2. Fine-Grained Discrimination (Task 2): Contrast "我把书看了。" (I read the book.) with "书被我看了。" (The book was read by me.). The model must explain the shift in focus from agent to patient.
  3. Categorical Discrimination (Task 3): Given an erroneous sentence: "我放书在桌子上。" (intended: I put the book on the table.), which lacks "$\text{把}$". The model must classify the error type as "Missing BA construction where required."
  4. Interference - Single (Task 4): Present a correct but potentially confusing sentence that does not use "$\text{把}$" but could: "我打开了门。" (I opened the door.) alongside "我把门打开了。" The model must recognize that both are grammatical but differ pragmatically.
  5. Interference - Multiple (Task 5): Present a set of sentences, some using "$\text{把}$" correctly, some incorrectly, and some using alternative structures. Question: "Which two sentences show the same grammatical focus on the object?" This requires reasoning across sentences.

This case shows how CPG-EVAL moves from simple pattern matching to deeper pedagogical reasoning.
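
The move from pattern matching to reasoning can be illustrated with a toy run over three of the sentences from the case above. The `keyword_baseline` function is a deliberately naive stand-in for a model (an assumption for illustration, not any system from the paper): it labels a sentence correct whenever "把" appears. The item labels `"correct"` and `"missing-ba"` are likewise hypothetical.

```python
from typing import Callable

# (task number, sentence, gold label): sentences taken from the case above,
# labels invented for this sketch.
ITEMS = [
    (1, "我把书放在桌子上。", "correct"),
    (3, "我放书在桌子上。", "missing-ba"),
    (4, "我打开了门。", "correct"),  # grammatical without 把: the interference case
]


def keyword_baseline(sentence: str) -> str:
    """Naive stand-in for a model: 'correct' only if 把 appears in the sentence."""
    return "correct" if "把" in sentence else "missing-ba"


def run(model: Callable[[str], str]) -> dict[int, bool]:
    """Return, per task number, whether the model's answer matches the gold label."""
    return {task: model(sentence) == gold for task, sentence, gold in ITEMS}


results = run(keyword_baseline)
print(results)  # {1: True, 3: True, 4: False}
```

The surface heuristic passes Tasks 1 and 3 but fails the Task 4 item, which is the kind of separation between shortcut and understanding that the tiered design aims to expose.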

8. Future Applications & Research Directions

  • Benchmark Expansion: Extend CPG-EVAL to other languages with complex pedagogical grammars (e.g., Korean, Arabic).
  • From Evaluation to Improvement: Use CPG-EVAL as a training signal for pedagogical alignment through fine-tuning, producing LLMs optimized specifically for teaching roles.
  • Integration with Education Platforms: Embed lightweight CPG-EVAL-style evaluation modules in EdTech platforms for continuous monitoring of AI tutor quality.
  • Multimodal Evaluation: Future benchmarks could assess an AI's ability to explain grammar using diagrams, gestures, or code-switching, going beyond text alone.
  • Longitudinal & Adaptive Evaluation: Develop benchmarks that track a model's ability to adapt its explanations to the learner's proficiency level, a step toward truly personalized AI tutoring.
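
For the platform-integration direction above, continuous monitoring could take the form of a simple release gate that compares per-task accuracy against minimum thresholds. This is a sketch under stated assumptions: the function `failing_tasks`, the task keys, and every number below are hypothetical, and the source does not propose specific thresholds.

```python
def failing_tasks(scores: dict[str, float], minimum: dict[str, float]) -> list[str]:
    """Return the tasks whose accuracy is below the required minimum.

    A task missing from `scores` is treated as failing.
    """
    return sorted(
        task for task, floor in minimum.items() if scores.get(task, 0.0) < floor
    )


# Hypothetical thresholds and made-up scores, for illustration only.
minimum = {"recognition": 0.90, "interference_multiple": 0.75}
scores = {"recognition": 0.95, "interference_multiple": 0.60}

blocked = failing_tasks(scores, minimum)
print(blocked)  # ['interference_multiple']
```

Gating on each task separately prevents a strong score on easy recognition items from masking a weak score on the interference tiers.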

9. References

  1. Wang, D. (2025). CPG-EVAL: A Multi-Tiered Benchmark for Evaluating the Chinese Pedagogical Grammar Competence of Large Language Models. arXiv preprint arXiv:2504.13261.
  2. Bin-Hady, W. R. A., Al-Kadi, A., Hazaea, A., & Ali, J. K. M. (2023). Exploring the Dimensions of ChatGPT in English Language Learning: A Global Perspective. Library Hi Tech.
  3. Kohnke, L., Moorhouse, B. L., & Zou, D. (2023). ChatGPT for Language Teaching and Learning. RELC Journal.
  4. Srivastava, A., et al. (2022). Beyond the Imitation Game: Quantifying and Extrapolating the Capabilities of Language Models. arXiv preprint arXiv:2206.04615.
  5. Liang, P., et al. (2023). Holistic Evaluation of Language Models. Transactions on Machine Learning Research.
  6. Hendrycks, D., et al. (2021). Measuring Massive Multitask Language Understanding. Proceedings of ICLR.
  7. Lewis, P., et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. Advances in Neural Information Processing Systems.