NAME

Data::Frame::Examples - Example data sets

VERSION

version 0.0053

SYNOPSIS

    use Data::Frame::Examples qw(:datasets dataset_names);

    my $datasets = dataset_names();    # names of all example datasets

    my $mtcars = mtcars();

DESCRIPTION

Example datasets as Data::Frame objects.

Check out Data::Frame::Examples::dataset_names() for the names of the example datasets provided by this module.

FUNCTIONS

dataset_names

Returns an array of names of the datasets in this module.

DATASETS

airquality

A dataset with 154 observations on 6 variables, holding daily air quality readings from May 1, 1973 to September 30, 1973.

The variables are:

  • Ozone

    numeric Ozone (ppb)

  • Solar_R

    numeric Solar radiation (lang, i.e. Langleys)

  • Wind

    numeric Wind (mph)

  • Temp

    numeric Temperature (degrees F)

  • Month

    numeric Month (1-12)

  • Day

    numeric Day of month (1-31)

diamonds

A dataset containing the prices and other attributes of 53,940 diamonds, with 10 variables.

The variables are:

  • price

    price in US dollars

  • carat

    weight of the diamond

  • cut

    quality of the cut (Fair, Good, Very Good, Premium, Ideal)

  • color

    diamond colour, from J (worst) to D (best)

  • clarity

    a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))

  • x

    length in mm

  • y

    width in mm

  • z

    depth in mm

  • depth

    total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43–79)

  • table

    width of top of diamond relative to widest point
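
The two forms of the depth formula given above are equivalent, because mean(x, y) = (x + y) / 2. A minimal sketch in plain Perl follows. The helper name depth_percentage and the sample measurements are hypothetical, and the factor of 100 is an assumption inferred from the stated 43 to 79 range, not something this documentation confirms.

```perl
use strict;
use warnings;

# Hypothetical helper: total depth percentage from x, y, z in mm.
# Assumption: the ratio is scaled by 100 to land in the 43 to 79 range.
sub depth_percentage {
    my ($x, $y, $z) = @_;
    my $mean_xy = ($x + $y) / 2;    # mean(x, y)
    return 100 * $z / $mean_xy;     # same as 100 * 2 * z / (x + y)
}

# Made-up measurements, not taken from the dataset.
my ($x, $y, $z) = (4.0, 4.0, 2.5);

my $via_mean = depth_percentage($x, $y, $z);
my $via_sum  = 100 * 2 * $z / ($x + $y);

printf "depth = %.1f%% (both forms agree: %s)\n",
    $via_mean, (abs($via_mean - $via_sum) < 1e-9 ? 'yes' : 'no');
# prints: depth = 62.5% (both forms agree: yes)
```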

economics

A dataset with 574 rows and 6 variables, produced from US economic time series data available from http://research.stlouisfed.org/fred2.

The variables are:

  • date

    Month of data collection

  • psavert

    personal saving rate

  • pce

    personal consumption expenditures, in billions of dollars

  • unemploy

    number of unemployed in thousands

  • uempmed

    median duration of unemployment, in weeks

  • pop

    total population, in thousands

economics_long

A dataset with 2870 rows and 4 variables.

It comes from the same data source as economics. The difference is that economics is in "wide" format, while economics_long is in "long" format.
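
The row counts are consistent with this: economics has 5 measured variables besides date, and 574 * 5 = 2870. The wide-to-long reshaping can be sketched in plain Perl as below. This is an illustration only, not how the module builds its data; the helper wide_to_long, the output keys variable and value, and the sample numbers are all hypothetical.

```perl
use strict;
use warnings;

# Two made-up rows in "wide" format: one column per measured variable.
my @wide = (
    { date => '2000-01-01', pce => 100, pop => 2000 },
    { date => '2000-02-01', pce => 110, pop => 2010 },
);
my @measures = qw(pce pop);

# Hypothetical helper: emit one row per (date, measured variable) pair.
sub wide_to_long {
    my ($rows, $measures) = @_;
    my @long;
    for my $variable (@$measures) {
        for my $row (@$rows) {
            push @long, {
                date     => $row->{date},
                variable => $variable,
                value    => $row->{$variable},
            };
        }
    }
    return \@long;
}

my $long = wide_to_long(\@wide, \@measures);

# 2 wide rows * 2 measured variables = 4 long rows
printf "%s %s %s\n", @{$_}{qw(date variable value)} for @$long;
```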

faithfuld

A 2D density estimate of the waiting and eruptions variables of the faithful data, with 5,625 observations and 3 variables.

iris

A dataset with 150 cases and 5 variables, for 50 flowers from each of 3 species of iris.

The variables are:

  • Sepal_Length

  • Sepal_Width

  • Petal_Length

  • Petal_Width

  • Species

    The species are setosa, versicolor, and virginica.

mpg

A subset of the fuel economy data that the EPA makes available on http://fueleconomy.gov. 234 rows and 11 variables.

The variables are:

  • manufacturer

  • model

    model name

  • displ

    Engine displacement, in litres

  • year

    year of manufacture

  • cyl

    number of cylinders

  • trans

    type of transmission

  • drv

    f = front-wheel drive, r = rear-wheel drive, 4 = 4wd

  • cty

    city miles per gallon

  • hwy

    highway miles per gallon

  • fl

    fuel type

  • class

    "type" of car

mtcars

Data extracted from the 1974 Motor Trend US magazine, for 32 automobiles (1973-74 models). 32 observations on 11 variables.

The variables are:

  • mpg

    Miles/(US) gallon

  • cyl

    Number of cylinders

  • disp

    Displacement (cu.in.)

  • hp

    Gross horsepower

  • drat

    Rear axle ratio

  • wt

    Weight (1000 lbs)

  • qsec

    1/4 mile time

  • vs

    Engine (0 = V-shaped, 1 = straight)

  • am

    Transmission (0 = automatic, 1 = manual)

  • gear

    Number of forward gears

  • carb

    Number of carburetors

txhousing

Information about the housing market in Texas provided by the TAMU real estate center, http://recenter.tamu.edu/. 8602 observations and 9 variables.

The variables are:

  • city

    Name of MLS area

  • year, month, date

  • sales

    Number of sales

  • volume

    Total value of sales

  • median

    Median sale price

  • listings

    Total active listings

  • inventory

    "Months inventory": amount of time it would take to sell all current listings at current pace of sales.
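
One simplified reading of the "months inventory" definition above is active listings divided by sales per month. The sketch below illustrates only that reading; the helper months_inventory and the figures are hypothetical, and the data provider may compute the published values differently (for example, with an averaged sales pace).

```perl
use strict;
use warnings;

# Hypothetical helper.
# Assumption: "pace of sales" means the number of sales per month.
sub months_inventory {
    my ($listings, $sales_per_month) = @_;
    return undef unless $sales_per_month;    # undefined when nothing is selling
    return $listings / $sales_per_month;
}

# Made-up figures, not taken from the dataset.
printf "%.1f months\n", months_inventory(600, 150);
# prints: 4.0 months
```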

SEE ALSO

Data::Frame

AUTHORS

  • Zakariyya Mughal <zmughal@cpan.org>

  • Stephan Loyd <sloyd@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2014, 2019 by Zakariyya Mughal, Stephan Loyd.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.