
2 Creating an ESPRESSO data object

Joses W. Ho edited this page Sep 13, 2018 · 2 revisions

Load Libraries

Before you perform analysis, you need to load the espresso package.

In [1]: import espresso as esp

Creating an espresso object

CRITTA produces a MetaData file, a FeedLog file, and a FeedStats file.

They should have the respective filenames:

  • MetaData_<Date>_<TimeStarted>_<OptionalAnnotation>.csv
  • FeedLog_<Date>_<TimeStarted>_<OptionalAnnotation>.csv
  • FeedStats_<Date>_<TimeStarted>_<OptionalAnnotation>.csv

As the names suggest, the MetaData file contains the metadata of each fly/animal in the experiment. The FeedLog file contains the details of each feed. The FeedStats file contains the count of feeds per minute, for each chamber.

The first thing to note is that matching files from the same experiment will have exactly the same Date and TimeStarted.

The second thing to note is that matching CSVs must be in the same folder when the data is loaded.
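The two rules above can be checked before loading. The following is a minimal sketch, not part of espresso: the helper `unmatched_feedlogs` and the example filenames are hypothetical, and the date and time formats in the pattern are assumptions based on the filenames shown elsewhere on this page.

```python
import re

# Matches "<Kind>_<Date>_<TimeStarted>..." at the start of a filename.
# The YYYY-MM-DD and HH-MM-SS formats are assumptions, not a CRITTA specification.
FILENAME_PATTERN = re.compile(
    r"^(MetaData|FeedLog|FeedStats)_(\d{4}-\d{2}-\d{2})_(\d{2}-\d{2}-\d{2})"
)


def unmatched_feedlogs(filenames):
    """Return the FeedLog filenames that have no MetaData file
    with the same Date and TimeStarted."""
    metadata_stamps = set()
    feedlogs = []
    for name in filenames:
        match = FILENAME_PATTERN.match(name)
        if match is None:
            continue  # Not a CRITTA-style filename; ignore it.
        kind, date, time_started = match.groups()
        if kind == "MetaData":
            metadata_stamps.add((date, time_started))
        elif kind == "FeedLog":
            feedlogs.append((name, (date, time_started)))
    return [name for name, stamp in feedlogs if stamp not in metadata_stamps]


# Hypothetical folder listing used only for illustration.
example_files = [
    "MetaData_2017-03-14_13-17-28_demo.csv",
    "FeedLog_2017-03-14_13-17-28_demo.csv",
    "FeedStats_2017-03-14_13-17-28_demo.csv",
    "FeedLog_2017-03-15_09-00-00_demo.csv",  # No matching MetaData file.
]
missing = unmatched_feedlogs(example_files)
print(missing)  # ['FeedLog_2017-03-15_09-00-00_demo.csv']
```

To check a real folder, pass the result of `os.listdir(folder)` to the helper instead of the example list.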

To begin analyzing data, you start by creating an espresso object.

This is done with the espresso function. It requires you to enter:

  • folder: the directory where matching CSVs are located, and

  • expt_duration_minutes: the actual duration of the experiment, in minutes. If you were running a six-hour experiment, you would enter "360". (Even if you are only analyzing the first two hours, for example, you still need to enter the actual duration the experiment ran for.)
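As a small sketch of how these two values might be prepared, the snippet below builds the folder path from its parts and converts hours to minutes. The variable names `data_folder` and `duration_minutes` and the folder names are placeholders. Converting the path to a plain string is a precaution, since this page does not say whether espresso accepts `pathlib` objects.

```python
from pathlib import Path

# Joining the parts with "/" lets pathlib pick the right separator
# for the operating system. The folder names are placeholders.
data_folder = str(Path.home() / "ESPRESSO-data" / "experiment-1")

# Use the full duration the experiment ran for, in minutes,
# even if only part of it will be analyzed.
experiment_hours = 6
duration_minutes = experiment_hours * 60

print(data_folder)
print(duration_minutes)  # 360
```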

In [2]: path_to_data = '/Users/josesho/ESPRESSO-data/experiment-1' 
        # Replace this path with the actual folder containing the metadata and feedlog.
        # For Windows users, replace each '/' with '//'.

In [3]: my_expt = esp.espresso(folder=path_to_data, 
                               expt_duration_minutes=360) # For a six-hour experiment.

If espresso cannot find a matching MetaData file for a given FeedLog CSV within path_to_data, it will raise a NameError.

Saving and loading espresso objects

After creating an espresso object, you can save it using the save method.

In [4]: my_expt.save('/Users/josesho/ESPRESSO-data/experiment-1.espresso')
        # Replace the path within quotation marks with the actual filename and path.

This uses Python's built-in pickle library, and is designed to be backward-compatible with Python 2.x. (Using Python 3 is, however, highly recommended.)

You can load a saved espresso object using the load function.

In [5]: another_expt = esp.load('path/to/file')
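To illustrate what a pickle-based save and load involves, here is a minimal round trip using only the standard library. It is a sketch, not espresso's implementation: the dictionary stands in for an espresso object, and the use of protocol 2 (the newest protocol Python 2 can read) is an assumption about how Python 2 compatibility could be achieved.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for an espresso object.
original = {"folder": "experiment-1", "expt_duration_minutes": 360}

with tempfile.TemporaryDirectory() as tmp_dir:
    filename = os.path.join(tmp_dir, "experiment-1.espresso")

    # Save: pickle files must be opened in binary mode.
    # Protocol 2 is an assumption made here for Python 2 readability.
    with open(filename, "wb") as handle:
        pickle.dump(original, handle, protocol=2)

    # Load: read the object back from the same file.
    with open(filename, "rb") as handle:
        restored = pickle.load(handle)

print(restored == original)  # True
```

As with any pickle file, only load files that come from a source you trust, because unpickling can execute arbitrary code.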

Getting details about a loaded espresso object

Having successfully loaded at least one matching MetaData-FeedLog CSV pair, you can print out details about your experiment by simply evaluating the name of your espresso object. Running

In [6]: my_expt

should give something like

4 feedlogs with a total of 120 flies.

3 Genotypes [w1118;MB312B-Gal4, MB312B-Gal4>UAS-TrpA1, w1118;UAS-TrpA1]
Categories (3, object): [w1118;MB312B-Gal4 < w1118;UAS-TrpA1 < MB312B-Gal4>UAS-TrpA1].

2 Status types [Sibling, Offspring]
Categories (2, object): [Sibling < Offspring].

2 Temperatures [22, 29]
Categories (2, int64): [22 < 29].

2 FoodChoices [100mM_Sucrose, 100mM_Sucrose_100mM_Arabinose]
Categories (2, object): [100mM_Sucrose < 100mM_Sucrose_100mM_Arabinose].

1 Sex type [M]
Categories (1, object): [M].

1 type of FlyCountInChamber [1]
Categories (1, int64): [1].

Total experiment duration = 360 minutes

ESPRESSO v0.5.0

Try loading some of your own data. You can load MetaData-FeedLog CSV pairs from different experiments and dates, as long as they are in the same folder, and as long as each FeedLog CSV has its matching MetaData CSV.
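The `Categories (...): [a < b]` lines in the summary above have the same form as the repr of a pandas ordered categorical, where `<` gives the order of the categories. Assuming that is what espresso stores, the sketch below shows how such an ordering behaves. The data is made up, and the category order is copied from the Status line of the summary.

```python
import pandas as pd

# Made-up Status values; the category order follows the summary above.
status = pd.Series(
    pd.Categorical(
        ["Offspring", "Sibling", "Offspring"],
        categories=["Sibling", "Offspring"],
        ordered=True,
    )
)

# Sorting follows the category order, not alphabetical order.
sorted_status = status.sort_values().tolist()
print(sorted_status)  # ['Sibling', 'Offspring', 'Offspring']
```

An ordering like this is what typically lets groups appear in a fixed, meaningful sequence in tables and plots.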

Attributes of an espresso object

You can retrieve the original MetaData as a pandas DataFrame with

In [7]: my_expt.flies
Genotype	Sex	Minimum Age	Maximum Age	Tube1	Tube2	Temperature	FlyCountInChamber	FlyID	AtLeastOneFeed
0	w1118;MB312B-Gal4	M	5	7	100mM_Sucrose	100mM_Sucrose_100mM_Arabinose	22	1	2017-03-14_13-17-28-MB312B-Gal4-UAS-TrpA1-22_Fly1	True
1	w1118;MB312B-Gal4	M	5	7	100mM_Sucrose	100mM_Sucrose_100mM_Arabinose	22	1	2017-03-14_13-17-28-MB312B-Gal4-UAS-TrpA1-22_Fly2	True
2	w1118;MB312B-Gal4	M	5	7	100mM_Sucrose	100mM_Sucrose_100mM_Arabinose	22	1	2017-03-14_13-17-28-MB312B-Gal4-UAS-TrpA1-22_Fly3	True
3	w1118;MB312B-Gal4	M	5	7	100mM_Sucrose	100mM_Sucrose_100mM_Arabinose	22	1	2017-03-14_13-17-28-MB312B-Gal4-UAS-TrpA1-22_Fly4	True
4	w1118;MB312B-Gal4	M	5	7	100mM_Sucrose	100mM_Sucrose_100mM_Arabinose	22	1	2017-03-14_13-17-28-MB312B-Gal4-UAS-TrpA1-22_Fly5	True
120 rows × 10 columns

The same can be done for the FeedLog with

In [8]: my_expt.feeds
StartTime	StartFrame	FeedTubeIdx	FlyID	ChoiceIdx	AviFile	FeedVol_µl	FeedDuration_ms	Evap-mm3/s	Valid	...	Maximum Age	Tube1	Tube2	Temperature	FlyCountInChamber	AverageFeedVolumePerFly_µl	AverageFeedCountPerFly	AverageFeedSpeedPerFly_µl/s	FoodChoice	FeedLog_rawfile
532	NaN	NaN	0	2017-03-14_13-17-28-MB312B-Gal4-UAS-TrpA1-22_Fly1	0	NIL	NaN	NaN	NaN	False	...	7	100mM_Sucrose	100mM_Sucrose_100mM_Arabinose	22	1	NaN	0.0	NaN	100mM_Sucrose	FeedLog_2017-03-31_13-42-10-MB312B-Gal4-UAS-Tr...
534	NaN	NaN	1	2017-03-14_13-17-28-MB312B-Gal4-UAS-TrpA1-22_Fly1	1	NIL	NaN	NaN	NaN	False	...	7	100mM_Sucrose	100mM_Sucrose_100mM_Arabinose	22	1	NaN	0.0	NaN	100mM_Sucrose_100mM_Arabinose	FeedLog_2017-03-31_13-42-10-MB312B-Gal4-UAS-Tr...
492	14/03/2017 13:36:26.081	4133459.0	0	2017-03-14_13-17-28-MB312B-Gal4-UAS-TrpA1-22_Fly1	0	FeedEvent_0_2017-03-14_13-36-26.081.avi	0.237945	68506.0	0.000023	True	...	7	100mM_Sucrose	100mM_Sucrose_100mM_Arabinose	22	1	0.237945	1.0	0.003473	100mM_Sucrose	FeedLog_2017-03-31_13-42-10-MB312B-Gal4-UAS-Tr...
493	14/03/2017 13:40:27.382	4140719.0	0	2017-03-14_13-17-28-MB312B-Gal4-UAS-TrpA1-22_Fly1	0	FeedEvent_0_2017-03-14_13-40-27.382.avi	0.001197	8576.0	0.000086	True	...	7	100mM_Sucrose	100mM_Sucrose_100mM_Arabinose	22	1	0.001197	1.0	0.000140	100mM_Sucrose	FeedLog_2017-03-31_13-42-10-MB312B-Gal4-UAS-Tr...
494	14/03/2017 13:40:37.918	4141036.0	0	2017-03-14_13-17-28-MB312B-Gal4-UAS-TrpA1-22_Fly1	0	FeedEvent_0_2017-03-14_13-40-37.918.avi	0.001037	3424.0	0.000086	True	...	7	100mM_Sucrose	100mM_Sucrose_100mM_Arabinose	22	1	0.001037	1.0	0.000303	100mM_Sucrose	FeedLog_2017-03-31_13-42-10-MB312B-Gal4-UAS-Tr...
4295 rows × 28 columns

Each row of the .feeds DataFrame represents a single feed event, with the time, volume, and duration recorded. The metadata for each fly is also recorded.

Note: As shown above, some feeds have NaN in their FeedVol_µl and FeedDuration_ms columns. These rows are padding rather than real feeds, and should have feed times of either 0.5 seconds or 6 hr 4 min 49 sec. They are included so that non-feeding flies, and food choices that were not selected, can still be identified in later plots (e.g. contrast plots).
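If you want to work with real feed events only, the padding rows described in the note above can be filtered out with ordinary pandas operations. The sketch below uses a small made-up DataFrame with a few of the column names shown in the .feeds table. The FlyID values are shortened placeholders.

```python
import pandas as pd

nan = float("nan")

# Made-up stand-in for my_expt.feeds, with two padding rows.
feeds = pd.DataFrame({
    "FlyID": ["Fly1", "Fly1", "Fly1", "Fly2"],
    "FeedVol_µl": [nan, 0.237945, 0.001197, nan],
    "FeedDuration_ms": [nan, 68506.0, 8576.0, nan],
    "Valid": [False, True, True, False],
})

# Keep only rows that record an actual feed volume and duration.
real_feeds = feeds.dropna(subset=["FeedVol_µl", "FeedDuration_ms"])

# Total volume consumed by each fly that fed at least once.
total_volume_per_fly = real_feeds.groupby("FlyID")["FeedVol_µl"].sum()
print(total_volume_per_fly)
```

Note that Fly2, which never fed, no longer appears once the padding rows are dropped. This is why the padding rows exist: they keep non-feeding flies visible to later analyses.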