Linguist 278: Programming for Linguists
Stanford Linguistics, Fall 2021
Christopher Potts

Class 12: Pandas exercises

In [ ]:

%matplotlib inline
import os
import pandas as pd

Contents

Toy movie dataset
Concreteness lexicon

Toy movie dataset

Read in the file with the 'Movie' column as the index

The file is here:

https://web.stanford.edu/class/linguist278/data/movie-data.csv

In [ ]:

## TO BE COMPLETED ##
pass
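
As a minimal sketch of the technique (not a solution for the course file), `pd.read_csv` takes an `index_col` argument naming the column to use as the index. The inline CSV text and its two rows below are made up for illustration; only the column names come from the exercises.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real file: made-up rows, column names
# taken from the exercise text.
toy_csv = io.StringIO(
    'Movie,Worldwide Gross in Dollars\n'
    'Film B,"$2,500,000"\n'
    'Film A,"$1,000,000"\n'
)

# index_col tells pandas which column becomes the row index.
toy_movies = pd.read_csv(toy_csv, index_col='Movie')

# head(n) returns the first n rows of the frame.
print(toy_movies.head(2))
```

`pd.read_csv` also accepts a URL string in place of a file object, so the same call pattern should apply to the address given above.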

View the top 2 lines of the DataFrame

In [ ]:

## TO BE COMPLETED ##
pass

Extract the 'Worldwide Gross in Dollars' column as a Series

In [ ]:

## TO BE COMPLETED ##
pass

Extract the row for 'Lost in Translation'

In [ ]:

## TO BE COMPLETED ##
pass
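
Column and row extraction use different accessors, which the following sketch illustrates on a made-up frame. The titles 'Film A' and 'Film B' and their values are placeholders, not data from the course file.

```python
import pandas as pd

# Made-up toy frame; titles and values are placeholders.
toy_movies = pd.DataFrame(
    {'Worldwide Gross in Dollars': ['$2,500,000', '$1,000,000']},
    index=pd.Index(['Film B', 'Film A'], name='Movie'))

# Square brackets with a column name return that column as a Series.
gross = toy_movies['Worldwide Gross in Dollars']

# .loc with an index label returns that row, also as a Series.
row = toy_movies.loc['Film A']

print(gross)
print(row)
```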

Sort the DataFrame based on index

In [ ]:

## TO BE COMPLETED ##
pass
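
A small sketch of index sorting on made-up data: `sort_index` returns a new frame ordered by index label and leaves the original untouched. The 'Year' column and the film titles here are hypothetical.

```python
import pandas as pd

# Made-up toy frame with an unsorted index.
toy = pd.DataFrame(
    {'Year': [2003, 1999, 2010]},
    index=['Film C', 'Film A', 'Film B'])

# sort_index returns a sorted copy; toy itself keeps its original order.
sorted_toy = toy.sort_index()

print(sorted_toy)
```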

Convert the 'Worldwide Gross in Dollars' values to float using apply

In [ ]:

## TO BE COMPLETED ##
pass
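
One way `apply` can be used for this kind of conversion is sketched below. It assumes the values are strings such as '$1,000,000' (a dollar sign plus comma separators); that format is a guess, so check the actual column before reusing the helper. The helper name `dollars_to_float` is hypothetical.

```python
import pandas as pd

def dollars_to_float(s):
    """Convert a string like '$1,000,000' to a float.

    Assumes a leading '$' and comma thousands separators.
    """
    return float(s.replace('$', '').replace(',', ''))

# Made-up values in the assumed string format.
gross = pd.Series(['$2,500,000', '$1,000,000'], index=['Film B', 'Film A'])

# apply calls the function on each value and returns a new Series.
gross_float = gross.apply(dollars_to_float)

print(gross_float)
```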

Create a horizontal barplot of the converted 'Worldwide Gross in Dollars' values

In [ ]:

## TO BE COMPLETED ##
pass

Concreteness lexicon

Read in the file with the 'Word' column as the index

In [ ]:

concreteness_url = ('http://web.stanford.edu/class/linguist278/data/'
                    'Concreteness_ratings_Brysbaert_et_al_BRM.csv')

In [ ]:

## TO BE COMPLETED ##
pass

Get the max value in the 'Conc.M' column

In [ ]:

## TO BE COMPLETED ##
pass

Sort the frame based primarily on 'Dom_Pos' and secondarily on 'Percent_known'

In [ ]:

## TO BE COMPLETED ##
pass
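
Multi-key sorting can be sketched with `sort_values` and a list of column names: the first name is the primary key and later names break ties. The words 'w1' to 'w4' and all values below are made up; only the column names come from the exercise.

```python
import pandas as pd

# Made-up toy frame using the exercise's column names.
toy = pd.DataFrame(
    {'Dom_Pos': ['Verb', 'Noun', 'Noun', 'Verb'],
     'Percent_known': [0.9, 1.0, 0.8, 0.7]},
    index=pd.Index(['w1', 'w2', 'w3', 'w4'], name='Word'))

# Sort by 'Dom_Pos' first, then by 'Percent_known' within each 'Dom_Pos'.
sorted_toy = toy.sort_values(['Dom_Pos', 'Percent_known'])

print(sorted_toy)
```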

Get the subframe of rows whose 'Conc.M' is equal to the max of these values

In [ ]:

## TO BE COMPLETED ##
pass
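
Finding a column maximum and selecting the matching rows can be sketched with `Series.max` and a boolean mask. The frame below is made up; the point is that more than one row can tie for the maximum, and the mask keeps all of them.

```python
import pandas as pd

# Made-up toy frame; two rows share the maximum value.
toy = pd.DataFrame(
    {'Conc.M': [5.0, 3.2, 5.0, 1.1]},
    index=pd.Index(['w1', 'w2', 'w3', 'w4'], name='Word'))

# max() on a Series returns its largest value.
max_val = toy['Conc.M'].max()

# Comparing a Series to a scalar yields a boolean Series; indexing the
# frame with it keeps only the rows where the comparison is True.
top = toy[toy['Conc.M'] == max_val]

print(top)
```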

Use apply to remap 'Bigram'

Change 0 to 'unigram' and 1 to 'bigram'.

In [ ]:

## TO BE COMPLETED ##
pass
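
A remapping of this kind can be sketched with `apply` and a small lookup function. The toy frame and the helper name `bigram_label` are hypothetical; the 0 and 1 codes and their labels follow the instruction above.

```python
import pandas as pd

# Made-up toy frame with the 0/1 coding described in the exercise.
toy = pd.DataFrame(
    {'Bigram': [0, 1, 0]},
    index=pd.Index(['w1', 'w2', 'w3'], name='Word'))

labels = {0: 'unigram', 1: 'bigram'}

def bigram_label(code):
    """Look up the label for a 0/1 code; raises KeyError on other values."""
    return labels[code]

# Replace the column with its remapped version.
toy['Bigram'] = toy['Bigram'].apply(bigram_label)

print(toy)
```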

Use groupby and apply to get the mean 'Conc.M' for each 'Dom_Pos'

In [ ]:

## TO BE COMPLETED ##
pass
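
The groupby-then-apply pattern can be sketched as follows on made-up data: `groupby` splits the rows by the values of one column, and `apply` runs a function on each group's values. Only the column names come from the exercise.

```python
import pandas as pd

# Made-up toy frame using the exercise's column names.
toy = pd.DataFrame(
    {'Dom_Pos': ['Noun', 'Verb', 'Noun', 'Verb'],
     'Conc.M': [4.0, 2.0, 5.0, 3.0]})

# Group rows by 'Dom_Pos', select 'Conc.M' within each group, and apply
# a function that reduces each group's values to their mean.
means = toy.groupby('Dom_Pos')['Conc.M'].apply(lambda s: s.mean())

print(means)
```

Calling `.mean()` directly on the grouped column gives the same result; `apply` is shown here because the exercise asks for it.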