
klue_ner

KLUE benchmark - Named Entity Recognition (NER) task.

For more details, see KLUE Benchmark - NER Task - Overview description

Table of contents

  1. Dataset Information
  2. How to use this dataset
  3. License

Dataset Information

  • See the code on GitHub
  • Version:
    • 1.0.0: Initial release.
    • 1.1.0 (default): KLUE 1.1.0
  • Homepage: https://github.com/KLUE-benchmark/KLUE
  • Download size: 12.59 MiB
  • Dataset size: 11.71 MiB
  • Features:

    FeaturesDict({
        'labels': Sequence(Text(shape=(), dtype=tf.string)),
        'tokens': Sequence(Text(shape=(), dtype=tf.string)),
    })
    
  • Supervised keys: ('tokens', 'labels')
  • Splits:

    Split Name   Num Examples
    train        21008
    dev          5000
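  • Examples (□ marks a token whose text did not survive; labels are as shown):

    #    tokens     labels
    1    □ □ □ □    B-PS I-PS O O
    2    3 □ □ □    B-DT I-DT O O
    3    □ □ □ □    B-PS I-PS I-PS I-PS
    4    □ □ □ (    B-PS I-PS I-PS O
    5    □ □ □ □    B-PS I-PS I-PS I-PS
    6    □ □ □ □    O O O B-PS
    7    □ □ □ □    O O O O
    8    1 4 □ (    B-DT I-DT I-DT O
    9    □ □ □ □    O O O O
    10   □ □ □ □    O O O O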
  • Citation:

    @misc{park2021klue,
        title={KLUE: Korean Language Understanding Evaluation},
        author={Sungjoon Park and Jihyung Moon and Sungdong Kim and Won Ik Cho and Jiyoon Han and Jangwon Park and Chisung Song and Junseong Kim and Yongsook Song and Taehwan Oh and Joohong Lee and Juhyun Oh and Sungwon Lyu and Younghoon Jeong and Inkwon Lee and Sangwoo Seo and Dongjun Lee and Hyunwoo Kim and Myeonghwa Lee and Seongbo Jang and Seungwon Do and Sunkyoung Kim and Kyungtae Lim and Jongwon Lee and Kyumin Park and Jamin Shin and Seonghyun Kim and Lucy Park and Alice Oh and Jungwoo Ha and Kyunghyun Cho},
        year={2021},
        eprint={2105.09680},
        archivePrefix={arXiv},
        primaryClass={cs.CL}
    }
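
The `tokens` and `labels` features are parallel sequences, and the labels follow a BIO scheme (`B-PS`, `I-PS`, `O`, and so on). As a minimal sketch of how such a pair could be turned into entity spans, the helper below groups each `B-` label with the matching `I-` labels that follow it. The function name `bio_to_spans` and the sample input are hypothetical and not part of this package. The sketch also assumes tokens can be joined without a separator, which fits character-level tokens but may not fit other tokenizations.

```python
def bio_to_spans(tokens, labels):
    """Group BIO-labelled tokens into (entity_type, text) pairs.

    Assumption: tokens are joined with no separator (character-level tokens).
    An I- label that does not continue an open span of the same type is
    treated as outside any entity.
    """
    spans = []
    current_type = None
    current_tokens = []

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A new entity starts; close the previous one if it is still open.
            if current_type is not None:
                spans.append((current_type, "".join(current_tokens)))
            current_type = label[2:]
            current_tokens = [token]
        elif label.startswith("I-") and current_type == label[2:]:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- label: close any open entity.
            if current_type is not None:
                spans.append((current_type, "".join(current_tokens)))
            current_type = None
            current_tokens = []

    # Close an entity that runs to the end of the sequence.
    if current_type is not None:
        spans.append((current_type, "".join(current_tokens)))
    return spans


# Made-up input for illustration only; not taken from the dataset.
example_tokens = ["홍", "길", "동", "은"]
example_labels = ["B-PS", "I-PS", "I-PS", "O"]
print(bio_to_spans(example_tokens, example_labels))
```

Silently dropping a stray `I-` label is one possible choice; a stricter variant could raise an error instead.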
    

How to use this dataset

  • Installation:

    pip install tfds-korean
    
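  • Use this dataset:

    import tensorflow_datasets as tfds
    import tfds_korean.klue_ner  # importing the module registers the "klue_ner" builder
    
    # Without a split argument, this returns a dict of splits ("train" and "dev").
    dataset = tfds.load("klue_ner")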
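
Since the dataset declares supervised keys `('tokens', 'labels')` and the splits `train` and `dev`, a single split can also be requested as `(tokens, labels)` pairs. The following is a sketch under those assumptions. The helper names `load_klue_ner_split` and `decode_example` are hypothetical, and the imports are placed inside the function only so that the helpers can be defined without TensorFlow installed.

```python
VALID_SPLITS = ("train", "dev")


def load_klue_ner_split(split="dev"):
    """Return a tf.data.Dataset of (tokens, labels) pairs for one split."""
    if split not in VALID_SPLITS:
        raise ValueError(f"split must be one of {VALID_SPLITS}, got {split!r}")

    # Lazy imports: only needed when the dataset is actually loaded.
    import tensorflow_datasets as tfds
    import tfds_korean.klue_ner  # noqa: F401  (registers the builder)

    # as_supervised=True yields tuples following the supervised keys.
    return tfds.load("klue_ner", split=split, as_supervised=True)


def decode_example(tokens, labels):
    """Convert the UTF-8 byte strings of one example into Python str lists."""
    return (
        [t.decode("utf-8") for t in tokens],
        [l.decode("utf-8") for l in labels],
    )


if __name__ == "__main__":
    import tensorflow_datasets as tfds

    ds = load_klue_ner_split("dev")
    # tfds.as_numpy converts tensors to NumPy arrays of bytes.
    for tokens, labels in tfds.as_numpy(ds.take(1)):
        print(decode_example(tokens, labels))
```

The first call downloads and prepares the data, so it needs network access and the `tfds-korean` package installed as described above.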
    

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. See also Copyright notice.