The ATIS (Airline Travel Information System) Dataset

This repository contains the ATIS dataset in Python pickle format and in Rasa NLU JSON format (https://rasa.com/docs/nlu/dataformat/#json-format). It also provides code that shows how to extract data from the pickle files.

Data Sample

Raw format

   0:         flight: BOS i want to fly from boston at 838 am and arrive in denver at 1110 in the morning EOS
                              BOS                                        O
                                i                                        O
                             want                                        O
                               to                                        O
                              fly                                        O
                             from                                        O
                           boston                      B-fromloc.city_name
                               at                                        O
                              838                       B-depart_time.time
                               am                       I-depart_time.time
                              and                                        O
                           arrive                                        O
                               in                                        O
                           denver                        B-toloc.city_name
                               at                                        O
                             1110                       B-arrive_time.time
                               in                                        O
                              the                                        O
                          morning              B-arrive_time.period_of_day
                              EOS                                        O
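
The listing above pairs each token with a BIO slot label, whereas the Rasa NLU JSON format stores entities as character spans. As a rough sketch of how the two relate, the Python below collapses BIO labels into spans. The function name `bio_to_entities` is hypothetical and not part of this repository, and the sketch assumes that the sentence text is the tokens joined by single spaces with the BOS/EOS markers dropped.

```python
def bio_to_entities(tokens, labels):
    """Collapse BIO-tagged tokens into Rasa-style entity spans.

    Assumption: character offsets index into the tokens joined by single spaces.
    """
    text = " ".join(tokens)
    entities = []
    current = None  # entity currently being built, or None
    pos = 0         # character offset of the current token within `text`
    for token, label in zip(tokens, labels):
        start, end = pos, pos + len(token)
        pos = end + 1  # step over the joining space
        if label.startswith("I-") and current and current["entity"] == label[2:]:
            current["end"] = end  # continuation: extend the open entity
            continue
        if current:
            entities.append(current)
            current = None
        if label.startswith(("B-", "I-")):
            # A stray I- label without a matching B- simply opens a new entity.
            current = {"start": start, "end": end, "entity": label[2:]}
    if current:
        entities.append(current)
    for entity in entities:
        entity["value"] = text[entity["start"]:entity["end"]]
    return text, entities


# Tokens and labels copied from the raw-format sample, without BOS/EOS.
tokens = ("i want to fly from boston at 838 am and arrive in denver "
          "at 1110 in the morning").split()
labels = ["O", "O", "O", "O", "O", "B-fromloc.city_name", "O",
          "B-depart_time.time", "I-depart_time.time", "O", "O", "O",
          "B-toloc.city_name", "O", "B-arrive_time.time", "O", "O",
          "B-arrive_time.period_of_day"]

text, entities = bio_to_entities(tokens, labels)
for entity in entities:
    print(entity)
```

For this sample, the two tokens `838` and `am` merge into a single `depart_time.time` entity, while `1110` and `morning` stay separate because each carries its own `B-` label.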

Rasa NLU JSON format

{
    "rasa_nlu_data": {
        "common_examples": [
            {
                "text": "i would like to find a flight from charlotte to las vegas that makes a stop in st. louis",
                "intent": "flight",
                "entities": [
                    {
                        "start": 35,
                        "end": 44,
                        "value": "charlotte",
                        "entity": "fromloc.city_name"
                    },
                    {
                        "start": 48,
                        "end": 57,
                        "value": "las vegas",
                        "entity": "toloc.city_name"
                    },
                    {
                        "start": 79,
                        "end": 88,
                        "value": "st. louis",
                        "entity": "stoploc.city_name"
                    }
                ]
            },
            ...
        ]
    }
}
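
In the sample above, `start` and `end` are character offsets into `text`, with `end` exclusive, so `text[start:end]` should equal `value`. The following is a minimal sketch that checks this for every entity using only the standard `json` module. The helper name `misaligned_entities` is hypothetical, and the inline sample is just the example shown above with the elided entries left out.

```python
import json


def misaligned_entities(rasa_data):
    """Yield (text, entity) pairs whose offsets do not reproduce `value`."""
    for example in rasa_data["rasa_nlu_data"]["common_examples"]:
        text = example["text"]
        for entity in example.get("entities", []):
            if text[entity["start"]:entity["end"]] != entity["value"]:
                yield text, entity


sample = json.loads("""
{
    "rasa_nlu_data": {
        "common_examples": [
            {
                "text": "i would like to find a flight from charlotte to las vegas that makes a stop in st. louis",
                "intent": "flight",
                "entities": [
                    {"start": 35, "end": 44, "value": "charlotte", "entity": "fromloc.city_name"},
                    {"start": 48, "end": 57, "value": "las vegas", "entity": "toloc.city_name"},
                    {"start": 79, "end": 88, "value": "st. louis", "entity": "stoploc.city_name"}
                ]
            }
        ]
    }
}
""")

problems = list(misaligned_entities(sample))
print(len(problems))  # prints 0: all three spans match their values
```

To run the same check on a downloaded file, replace `json.loads(...)` with `json.load(stream)` on a file opened with `open(path, encoding="utf-8")`, where `path` points to wherever `train.json` or `test.json` was saved.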

Summary of Data

| Number of Samples | Vocabulary Size | Number of Slots | Number of Intents |
| --- | --- | --- | --- |
| 4978 (training set) + 893 (testing set) | 943 | 129 | 26 |

Sample Code

summary_data.py contains code that reads the raw data files; use it as a reference for loading the data yourself.
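
The internal layout of the pickle files is not described in this README, so the sketch below makes no assumption about it. It loads a pickle and prints a shallow outline of whatever object it contains. This is not the code from summary_data.py, and the helper names `describe` and `describe_pickle` are hypothetical.

```python
import pickle


def describe(obj, label="root", depth=0, max_depth=2):
    """Return outline lines (type and length) for a nested Python object."""
    size = f", len {len(obj)}" if hasattr(obj, "__len__") else ""
    lines = [f"{'  ' * depth}{label}: {type(obj).__name__}{size}"]
    if depth < max_depth:
        if isinstance(obj, dict):
            # Show only the first few keys to keep the outline short.
            for key, value in list(obj.items())[:3]:
                lines += describe(value, repr(key), depth + 1, max_depth)
        elif isinstance(obj, (list, tuple)):
            # Show only the first few items.
            for index, item in enumerate(obj[:3]):
                lines += describe(item, f"[{index}]", depth + 1, max_depth)
    return lines


def describe_pickle(path):
    """Load a pickle file and return an outline of its contents."""
    # Only unpickle files you trust: pickle.load can execute arbitrary code.
    with open(path, "rb") as stream:
        return describe(pickle.load(stream))
```

For example, `print("\n".join(describe_pickle("atis.train.pkl")))` prints the outline for the training set, assuming the file sits in the current directory.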

Download Data

| Data Format | Training Set | Testing Set |
| --- | --- | --- |
| Python 3 Pickle Format | atis.train.pkl | atis.test.pkl |
| Rasa NLU JSON Format | train.json | test.json |

Credit

Similar Projects