Linguist 278: Programming for Linguists
Stanford Linguistics, Fall 2021
Christopher Potts

Class 12: Pandas exercises

In [ ]:
%matplotlib inline
import os
import pandas as pd

Toy movie dataset

Read in the file with the 'Movie' column as the index

The file is here:

https://web.stanford.edu/class/linguist278/data/movie-data.csv

In [ ]:
## TO BE COMPLETED ##
pass
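
One way to approach this, sketched on a tiny in-memory stand-in rather than the course file: the `index_col` argument of `pd.read_csv` names the column to use as row labels. The film titles, the `Year` column, and all values below are invented for illustration; the real file may look different.

```python
import io
import pandas as pd

# Invented stand-in for the course file; the real columns and values may differ.
toy_csv = io.StringIO(
    'Movie,Year,Worldwide Gross in Dollars\n'
    'Toy Film B,2001,"$2,500,000"\n'
    'Toy Film A,1999,"$1,000,000"\n'
)

# index_col names the column whose values become the row labels.
movies = pd.read_csv(toy_csv, index_col='Movie')
print(movies)
```

`pd.read_csv` also accepts a URL string, so with the real data the address above would presumably take the place of `toy_csv`.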

View the top 2 lines of the DataFrame

In [ ]:
## TO BE COMPLETED ##
pass
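
A minimal sketch of one option, `DataFrame.head`, which returns the first `n` rows. The frame here is a made-up placeholder (hypothetical titles and years), not the course data.

```python
import pandas as pd

# Hypothetical three-row frame standing in for the movie data.
movies = pd.DataFrame(
    {'Year': [2001, 1999, 2005]},
    index=pd.Index(['Toy Film B', 'Toy Film A', 'Toy Film C'], name='Movie'),
)

# head(2) keeps the first two rows in their current order.
top_two = movies.head(2)
print(top_two)
```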

Extract the 'Worldwide Gross in Dollars' column as a Series

In [ ]:
## TO BE COMPLETED ##
pass
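
Indexing a DataFrame with a single column name in square brackets returns that column as a Series. The sketch below assumes the column holds strings; the rows are invented placeholders.

```python
import pandas as pd

# Placeholder rows; the gross values are invented strings.
movies = pd.DataFrame(
    {'Worldwide Gross in Dollars': ['$2,500,000', '$1,000,000']},
    index=pd.Index(['Toy Film B', 'Toy Film A'], name='Movie'),
)

# Single brackets with one column name give a Series, not a DataFrame.
gross = movies['Worldwide Gross in Dollars']
print(type(gross))
```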

Extract the row for 'Lost in Translation'

In [ ]:
## TO BE COMPLETED ##
pass
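
Because the movie titles are the index, label-based lookup with `.loc` is one natural fit. This sketch uses a stand-in frame with a hypothetical `Year` column and one invented companion row; the real file has its own columns.

```python
import pandas as pd

# Stand-in frame: the 'Year' column and the second row are assumptions.
movies = pd.DataFrame(
    {'Year': [2003, 1999]},
    index=pd.Index(['Lost in Translation', 'Toy Film A'], name='Movie'),
)

# .loc with a single index label returns that row as a Series.
row = movies.loc['Lost in Translation']
print(row)
```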

Sort the DataFrame based on index

In [ ]:
## TO BE COMPLETED ##
pass
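
A possible approach is `DataFrame.sort_index`, which returns a new frame ordered by the row labels and leaves the original untouched. The rows below are invented.

```python
import pandas as pd

# Hypothetical frame with the index deliberately out of order.
movies = pd.DataFrame(
    {'Year': [2001, 2005, 1999]},
    index=pd.Index(['Toy Film B', 'Toy Film C', 'Toy Film A'], name='Movie'),
)

# sort_index orders rows by index label (ascending by default).
movies_sorted = movies.sort_index()
print(movies_sorted)
```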

Convert the 'Worldwide Gross in Dollars' values to float using apply

In [ ]:
## TO BE COMPLETED ##
pass
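
The conversion depends on how the values are written in the file. The sketch below assumes, without having checked the real data, that they are strings with a leading dollar sign and comma separators; `dollars_to_float` is a hypothetical helper, and the rows are invented.

```python
import pandas as pd

# ASSUMPTION: values look like '$1,000,000'. The real file may use another format.
movies = pd.DataFrame(
    {'Worldwide Gross in Dollars': ['$2,500,000', '$1,000,000']},
    index=pd.Index(['Toy Film B', 'Toy Film A'], name='Movie'),
)

def dollars_to_float(s):
    """Strip '$' and ',' from a string and convert what is left to float."""
    return float(s.replace('$', '').replace(',', ''))

# Series.apply calls the function once per value and collects the results.
movies['Worldwide Gross in Dollars'] = (
    movies['Worldwide Gross in Dollars'].apply(dollars_to_float)
)
print(movies.dtypes)
```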

Create a horizontal barplot of the converted 'Worldwide Gross in Dollars' values

In [ ]:
## TO BE COMPLETED ##
pass
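
One option is the `plot.barh` accessor on a Series, which draws one horizontal bar per index label. The values here are invented, and the `Agg` backend line is only an assumption for running the sketch outside a notebook; inside the notebook, `%matplotlib inline` handles display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; not needed inside the notebook
import pandas as pd

# Invented, already-converted float values.
gross = pd.Series(
    [2.5e6, 1.0e6, 4.0e6],
    index=pd.Index(['Toy Film B', 'Toy Film A', 'Toy Film C'], name='Movie'),
    name='Worldwide Gross in Dollars',
)

# barh draws one horizontal bar per row, labeled by the index.
ax = gross.plot.barh()
ax.set_xlabel('Worldwide Gross in Dollars')
```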

Concreteness lexicon

Read in the file with the 'Word' column as the index

In [ ]:
concreteness_url = ('http://web.stanford.edu/class/linguist278/data/'
                    'Concreteness_ratings_Brysbaert_et_al_BRM.csv')

In [ ]:
## TO BE COMPLETED ##
pass

Get the max value in the 'Conc.M' column

In [ ]:
## TO BE COMPLETED ##
pass
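
`Series.max` is one direct way to get this. The sketch uses nonce words and invented ratings in place of the lexicon.

```python
import pandas as pd

# Nonce words with invented ratings; not taken from the real lexicon.
conc = pd.DataFrame(
    {'Conc.M': [5.0, 2.5, 3.75]},
    index=pd.Index(['wug', 'blick', 'dax'], name='Word'),
)

# Select the column, then take its maximum.
max_conc = conc['Conc.M'].max()
print(max_conc)
```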

Sort the frame based primarily on 'Dom_Pos' and secondarily on 'Percent_known'

In [ ]:
## TO BE COMPLETED ##
pass
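
Passing a list of column names to `DataFrame.sort_values` sorts by the first name and breaks ties with the second. A small sketch on invented rows:

```python
import pandas as pd

# Invented rows standing in for the lexicon.
conc = pd.DataFrame(
    {
        'Dom_Pos': ['Noun', 'Verb', 'Noun', 'Verb'],
        'Percent_known': [0.9, 1.0, 0.8, 0.7],
    },
    index=pd.Index(['wug', 'blick', 'dax', 'fep'], name='Word'),
)

# Primary key first, secondary key second; both ascending by default.
conc_sorted = conc.sort_values(['Dom_Pos', 'Percent_known'])
print(conc_sorted)
```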

Get the subframe of rows whose 'Conc.M' is equal to the max of these values

In [ ]:
## TO BE COMPLETED ##
pass
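
Boolean indexing is one way to do this: compare the column to its max to get a True/False mask, then index the frame with the mask. The ratings below are invented, with a deliberate tie at the top.

```python
import pandas as pd

# Invented ratings; two words share the highest value.
conc = pd.DataFrame(
    {'Conc.M': [5.0, 2.5, 5.0]},
    index=pd.Index(['wug', 'blick', 'dax'], name='Word'),
)

# The comparison yields a boolean Series; indexing with it keeps the True rows.
is_max = conc['Conc.M'] == conc['Conc.M'].max()
most_concrete = conc[is_max]
print(most_concrete)
```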

Use apply to remap 'Bigram'

Change 0 to 'unigram' and 1 to 'bigram'.

In [ ]:
## TO BE COMPLETED ##
pass
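
A minimal sketch of the remapping with `Series.apply`, assuming the column contains only the integers 0 and 1. `label_bigram` is a hypothetical helper and the rows are invented.

```python
import pandas as pd

# Invented rows; ASSUMPTION: 'Bigram' holds only 0 or 1.
conc = pd.DataFrame(
    {'Bigram': [0, 1, 0]},
    index=pd.Index(['wug', 'blick dax', 'fep'], name='Word'),
)

def label_bigram(value):
    """Map 1 to 'bigram' and anything else (here, 0) to 'unigram'."""
    return 'bigram' if value == 1 else 'unigram'

# apply runs the helper on each value and returns a new Series.
conc['Bigram'] = conc['Bigram'].apply(label_bigram)
print(conc)
```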

Use groupby and apply to get the mean 'Conc.M' for each 'Dom_Pos'

In [ ]:
## TO BE COMPLETED ##
pass
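
One possible shape for this: group the rows by 'Dom_Pos', select 'Conc.M', and apply a function that averages each group. The data below is invented.

```python
import pandas as pd

# Invented rows standing in for the lexicon.
conc = pd.DataFrame(
    {
        'Dom_Pos': ['Noun', 'Verb', 'Noun'],
        'Conc.M': [4.0, 2.0, 5.0],
    },
    index=pd.Index(['wug', 'blick', 'dax'], name='Word'),
)

# apply receives each group's 'Conc.M' values as a Series and returns one number.
mean_by_pos = conc.groupby('Dom_Pos')['Conc.M'].apply(lambda s: s.mean())
print(mean_by_pos)
```

The result is a Series indexed by the 'Dom_Pos' values, one mean per group.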