Python-3.4.3/C2/Loading-Data-From-Files/English


Loading data from files - English


Title of script: Loading data from files

Author: Ankit

Keywords: video tutorial, read data, files, loadtxt, ipython


Visual Cue
Narration
Show Slide

containing title, name of the production team along with the logo of MHRD

Hello friends, and welcome to the spoken tutorial on "loading data from files".
Show Slide

Learning objectives

Read data from files, which contain data in:

  • Single column format
  • Multiple columns separated by spaces or other delimiters
In this tutorial you will learn to,
  • Read data from files, which contain data in:
    • Single column format
    • Or multiple columns separated by spaces or other delimiters
Show Slide

System Requirements

  • Ubuntu Linux 14.04
To record this tutorial, I am using
  • Ubuntu Linux 14.04 operating system
  • Python 3.4.3
  • IPython 5.1.0
Show Slide

Pre-requisites

You should know how to run basic Python commands on the ipython console.

If not, for relevant Python tutorials, please visit this website http://spoken-tutorial.org

[Terminal]

ipython3

Let us first open the Terminal by pressing Ctrl+Alt+T keys simultaneously.

Now, type ipython3 and press Enter.

[IPython console]

%pylab and press Enter.

Let us initialise the pylab package.

Type percent pylab and press Enter.

[IPython Terminal]

Type

cat primes.txt

and press Enter

Let us begin with reading the file primes.txt. This file contains a list of prime numbers listed in a column.

Type cat(space)primes(dot)txt

We can use the cat command to fetch data from the file and display it on the terminal.

press Enter

We see the prime numbers are displayed on the terminal.

[IPython Terminal]

Type primes = loadtxt("primes.txt")

and press Enter

Now we can use the loadtxt() command to store this list into the variable primes.


Type primes(equal to)loadtxt(within parentheses)(within double quotes)primes(dot)txt and press Enter.

Please make sure that you provide the correct path to the file, 'primes.txt'.

The file, in our case, is present in the home folder.
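As a minimal sketch of this step outside the %pylab session, the snippet below imports loadtxt from numpy explicitly and reads a single-column file through its full path. The temporary folder and the file contents are stand-ins invented for illustration; they are not the actual primes.txt used in the recording.

```python
import os
import tempfile
from numpy import loadtxt  # %pylab makes loadtxt available without this import

# Create a small stand-in for primes.txt (hypothetical contents).
folder = tempfile.mkdtemp()
path = os.path.join(folder, "primes.txt")
with open(path, "w") as f:
    f.write("2\n3\n5\n7\n11\n")

# Pass the full path when the file is not in the current working directory.
primes = loadtxt(path)
print(primes)
```

If the path is wrong, loadtxt raises an error saying the file was not found, so checking the path is the first thing to do when the call fails.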

[IPython Terminal]

Type

print(primes) and press Enter

primes is now a sequence of the prime numbers that were listed in the file primes.txt.

Now let us display the contents in the variable primes.

So, type, print (within parentheses) primes and press Enter.

We see the sequence printed.

[IPython Terminal]

Highlight the output on the terminal

We observe that all the numbers end with a period ‘.’.

This is because all these numbers are floats.
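The float behaviour can be checked directly. The sketch below assumes a small stand-in for primes.txt supplied as an in-memory text stream, which loadtxt accepts in place of a file name. It also shows the dtype argument of loadtxt, which is not used in this tutorial, as one possible way to get integers instead.

```python
import io
from numpy import loadtxt

# Hypothetical stand-in for the contents of primes.txt.
text = "2\n3\n5\n7\n11\n"

# By default loadtxt converts every value to a float.
primes = loadtxt(io.StringIO(text))
print(primes.dtype)  # float64

# Assumption for illustration: ask for integers with the dtype argument.
primes_int = loadtxt(io.StringIO(text), dtype=int)
print(primes_int)
```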

[IPython Terminal]

Type

cat pendulum.txt

Highlight Cat

and press Enter

Type cat(space)pendulum(dot)txt

Press Enter.

This file contains two columns of data.

The first column contains the length of the pendulum.

The second column contains the corresponding time period.

[IPython Terminal]

Type

pend=loadtxt("pendulum.txt")

and press Enter

Let us now read the data from the file into the variable pend using the loadtxt command.

Type pend(equal to)loadtxt(within parentheses)(within double quotes)pendulum(dot)txt and press Enter.

Please note that loadtxt needs both columns of the file to have an equal number of rows.
[IPython Terminal]

print(pend)

and press Enter

Now print the variable pend to see what it contains.

Type print(within parentheses)pend and press Enter.
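To see how the two-column data is laid out after loading, here is a small sketch. The three rows of numbers are made-up stand-ins for pendulum.txt, passed as an in-memory text stream rather than a real file.

```python
import io
from numpy import loadtxt

# Hypothetical stand-in for pendulum.txt: length, then time period.
text = "0.1 0.69\n0.2 0.90\n0.3 1.19\n"
pend = loadtxt(io.StringIO(text))

print(pend.shape)   # (3, 2): three rows, two columns
print(pend[0])      # first row: one length and its time period
print(pend[:, 0])   # first column: all the lengths
```

Each line of the file becomes one row, so a single column has to be picked out by slicing, as in the last line.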

[IPython Terminal]

Type L,T=loadtxt("pendulum.txt", unpack=True)

and press Enter

Notice that pend is not a simple sequence like primes. It holds both columns of the data file, with one row for each line of the file.

Let us use an additional argument of the loadtxt command to read the data into two separate sequences.

Type L(comma)T(equal to)loadtxt(within parentheses within double quotes)pendulum(dot)txt(after double quotes comma)unpack(equal to)True

And press Enter.

[IPython Terminal]

Type print(L)

print(T)

and press Enter

Now print the variables L and T, to see what they contain.

Type print(within parentheses)L and press Enter

Type print(within parentheses)T and press Enter

Highlight the output

Notice that L and T now contain the first and second columns of data from pendulum.txt, respectively.

unpack(equal to)True has made the two columns into two separate and simple sequences.
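The effect of unpack=True can be sketched as follows, again with made-up numbers standing in for pendulum.txt. The comparison with the transposed array at the end is an added illustration of what unpacking does, not a step from the recording.

```python
import io
from numpy import loadtxt

# Hypothetical stand-in for pendulum.txt: length, then time period.
text = "0.1 0.69\n0.2 0.90\n0.3 1.19\n"

# Without unpack, we get one array with a row per line of the file.
pend = loadtxt(io.StringIO(text))

# With unpack=True, each column comes back as its own sequence.
L, T = loadtxt(io.StringIO(text), unpack=True)
print(L)
print(T)

# The unpacked columns match the rows of the transposed array.
same = (pend.T[0] == L).all() and (pend.T[1] == T).all()
print(same)  # True
```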

Show Slide 4

Assignment 1

Pause the video here, try out the following exercise, and then resume the video.
  • Read the data from the file pendulum(underscore)semicolon(dot)txt.
  • This file contains data in two columns. These columns are separated by semicolons.
  • Use the IPython help to see how to do this.
[IPython Terminal]

Type

L, T = loadtxt("pendulum_semicolon.txt", unpack=True, delimiter=";")


Let us look at the solution. Switch to the terminal.

First we will see the content of the file.

So type cat space pendulum(underscore)semicolon(dot)txt

Press Enter

We see the two columns separated by a semi-colon.

Now, type L(comma)T(equal to)loadtxt (within parentheses within double quotes) pendulum(underscore)semicolon (dot)txt(after double quotes comma)unpack(equal to)True(comma)delimiter(equal to)(within double quotes)semicolon.

And press Enter.

[IPython Terminal]

Type

print(L)

and press Enter

[IPython Terminal]

Type

print(T)

and press Enter

Then type print(within parentheses)L and press Enter.

Type print(within parentheses)T and press Enter.


This will display the contents inside the two variables L and T.
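A short sketch of why the delimiter argument matters is given below. The semicolon-separated text is a made-up stand-in for pendulum_semicolon.txt. Without the delimiter, loadtxt splits on whitespace only and cannot convert a value such as 0.1;0.69 to a number.

```python
import io
from numpy import loadtxt

# Hypothetical stand-in for pendulum_semicolon.txt.
text = "0.1;0.69\n0.2;0.90\n0.3;1.19\n"

# Without delimiter, each line is treated as one value and conversion fails.
try:
    loadtxt(io.StringIO(text), unpack=True)
except ValueError as err:
    print("Without delimiter:", err)

# With delimiter=";" the two columns are split correctly.
L, T = loadtxt(io.StringIO(text), unpack=True, delimiter=";")
print(L)
print(T)
```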

Summary slide

This brings us to the end of this tutorial. In this tutorial, we have learnt:

To read data from files using the loadtxt() command.

The data can be in

  • A single column format
  • Or multiple column format, separated by spaces or other delimiters.
Evaluation slide

Here are some self assessment questions for you to solve
  1. loadtxt can read data only from a file with one column. Is it True or False?
  2. Given a file data.txt with three columns of data separated by spaces.

Read it into 3 separate simple sequences.

Evaluation slide
  3. Given a file data.txt with three columns of data separated by colons.

Read it into 3 separate simple sequences.

Slide

Solution 1:

Now let us look at the answers,

The answer to the first question is False.

The loadtxt() command can read data from files having single columns as well as multiple columns.

Slide

Solution 2:

The answer to the second question is,

To separate data into three columns, we use the loadtxt() command as follows:

x(comma)y(comma)z(equal to)loadtxt(within parentheses and within double quotes)data(dot)txt(after double quotes comma)unpack(equal to)True
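As a sketch of reading three space-separated columns into three separate sequences, the snippet below assigns the unpacked result to three names. The names x, y and z and the sample numbers are assumptions for illustration; the contents of data.txt are not given in the question.

```python
import io
from numpy import loadtxt

# Hypothetical stand-in for data.txt with three space-separated columns.
text = "1 10 100\n2 20 200\n3 30 300\n"

# One name per column on the left-hand side gives three simple sequences.
x, y, z = loadtxt(io.StringIO(text), unpack=True)
print(x)
print(y)
print(z)
```

Assigning the result to a single name instead would give one array holding all three columns rather than three separate sequences.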

Slide


Solution 3:

The answer to the third question is,

We read into three separate sequences by using an additional argument of delimiter in the loadtxt command.

x(comma)y(comma)z(equal to)loadtxt(within parentheses and within double quotes)data(dot)txt(after double quotes comma)unpack(equal to)True(comma)delimiter(equal to)(within double quotes)colon
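A sketch for the third question follows, assuming the columns are separated by colons as the question states. The file is written to a temporary folder so that the example is self-contained; its contents and the names x, y and z are made up for illustration.

```python
import os
import tempfile
from numpy import loadtxt

# Write a hypothetical colon-separated data.txt to a temporary folder.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "data.txt")
with open(path, "w") as f:
    f.write("1:10:100\n2:20:200\n")

# delimiter=":" tells loadtxt how the columns are separated.
x, y, z = loadtxt(path, unpack=True, delimiter=":")
print(x)
print(y)
print(z)
```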

Show Slide

Forum

Do you have questions on THIS Spoken Tutorial?

Choose the minute and second where you have the question.

Explain your question briefly.

Someone from the FOSSEE team will answer them.

Please visit this site.

Show Slide

Fossee Forum

Do you have any general / technical questions?

Please visit the forum given in the link.

Show Slide

Textbook Companion

The FOSSEE team coordinates coding of solved examples of popular books.

We give honorarium and certificates to those who do this.

For more details, please visit this site.

Show Slide

Acknowledgment

The Spoken Tutorial project is funded by NMEICT, MHRD, Govt. of India
Show Slide

Thank You

This is Prathamesh Salunke from IIT Bombay signing off. Thanks for watching.

Contributors and Content Editors

Nancyvarkey, Nirmala Venkat, Pratham920