BRCC and SentiBahasaRojak

  • BRCC: Built by applying the Bahasa Rojak data augmentation algorithm to a Malay Wikipedia corpus.
  • SemEval-2017 Task 5 Subtask 1: A Malay version of SemEval-2017 Task 5 Subtask 1, constructed by human translation.
  • SentiBahasaRojak: A Bahasa Rojak sentiment analysis dataset. For product and movie reviews, we applied the Bahasa Rojak data augmentation algorithm to the Malay datasets. For stock reviews, we scraped posts from stock forums and hired five experts to label them.

Data and Resources

Tags

Wikidata Keywords

  • Q3201279
  • Q1172284
  • Q2271421

Basic Information

Data type: Compressed file

Administrative Information

Producer: Intelligent Information Service Research Lab, National Central University, Taiwan