Willkommen bei Scribd!

Introduction To The R Language: Reading and Writing Data

Hochgeladen von

0% fanden dieses Dokument nützlich (0 Abstimmungen)

34 Ansichten9 Seiten

R has functions for reading data into R like read.table and read.csv which can read tabular data from files with arguments like file path, header, separator, and column classes. It also has functions for writing out like write.table and dump. The read.table function is commonly used and has arguments like file name, header, separator, number of rows, and comment character. Specifying column classes can improve performance of reading larger datasets with read.table. It's also important to know system specifications like available memory for working with large datasets in R.

Originalbeschreibung:

Originaltitel

Slides Reading Data1

Copyright

Verfügbare Formate

PDF, TXT oder online auf Scribd lesen

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Dieses Dokument melden

Copyright:

Attribution Non-Commercial (BY-NC)

Verfügbare Formate

Als PDF, TXT herunterladen oder online auf Scribd lesen

Markieren Sie unangemessene Inhalte

0% fanden dieses Dokument nützlich (0 Abstimmungen)

34 Ansichten9 Seiten

Introduction To The R Language: Reading and Writing Data

Hochgeladen von

areacode212

Copyright:

Attribution Non-Commercial (BY-NC)

Verfügbare Formate

Als PDF, TXT herunterladen oder online auf Scribd lesen

Markieren Sie unangemessene Inhalte

Zu Seite

Sie sind auf Seite 1von 9

Im Dokument suchen

Introduction to the R Language

Reading and Writing Data

Computing for Data Analysis

1 / 22

Reading Data

There are a few principal functions reading data into R. read.table, read.csv, for reading tabular data readLines, for reading lines of a text le source, for reading in R code les (inverse of dump) dget, for reading in R code les (inverse of dput) load, for reading in saved workspaces unserialize, for reading single R objects in binary form

2 / 22

Writing Data

There are analogous functions for writing data to les write.table writeLines dump dput save serialize

3 / 22

Reading Data Files with read.table

The read.table function is one of the most commonly used functions for reading data. It has a few important arguments: file, the name of a le, or a connection header, logical indicating if the le has a header line sep, a string indicating how the columns are separated colClasses, a character vector indicating the class of each column in the dataset nrows, the number of rows in the dataset comment.char, a character string indicating the comment character skip, the number of lines to skip from the beginning stringsAsFactors, should character variables be coded as factors?

4 / 22

read.table
For small to moderately sized datasets, you can usually call read.table without specifying any other arguments data <- read.table("foo.txt") R will automatically skip lines that begin with a # gure out how many rows there are (and how much memory needs to be allocated) gure what type of variable is in each column of the table Telling R all these things directly makes R run faster and more eciently. read.csv is identical to read.table except that the default separator is a comma.
5 / 22

Reading in Larger Datasets with read.table

With much larger datasets, doing the following things will make your life easier and will prevent R from choking. Read the help page for read.table, which contains many hints Make a rough calculation of the memory required to store your dataset. If the dataset is larger than the amount of RAM on your computer, you can probably stop right here. Set comment.char = "" if there are no commented lines in your le.

6 / 22

Reading in Larger Datasets with read.table

Use the colClasses argument. Specifying this option instead of using the default can make read.table run MUCH faster, often twice as fast. In order to use this option, you have to know the class of each column in your data frame. If all of the columns are numeric, for example, then you can just set colClasses = "numeric". A quick an dirty way to gure out the classes of each column is the following: initial <- read.table("datatable.txt", nrows = 100) classes <- sapply(initial, class) tabAll <- read.table("datatable.txt", colClasses = classes) Set nrows. This doesnt make R run faster but it helps with memory usage. A mild overestimate is okay. You can use the Unix tool wc to calculate the number of lines in a le.
7 / 22

Know Thy System

In general, when using R with larger datasets, its useful to know a few things about your system. How much memory is available? What other applications are in use? Are there other users logged into the same system? What operating system? Is the OS 32 or 64 bit?

8 / 22

Calculating Memory Requirements

I have a data frame with 1,500,000 rows and 120 columns, all of which are numeric data. Roughly, how much memory is required to store this data frame? 1, 500, 000 120 8 bytes/numeric = 1440000000 bytes = 1, 373.29 MB = 1.34 GB

= 1440000000/220 bytes/MB

9 / 22

Das könnte Ihnen auch gefallen

How To Choose A Micro Controller
Dokument22 Seiten
How To Choose A Micro Controller
GAN SA KI
100% (1)
SAS TXT Import
Dokument13 Seiten
SAS TXT Import
udayraj_v
Noch keine Bewertungen
Tableau Course Tableau
Dokument21 Seiten
Tableau Course Tableau
paramp12900
Noch keine Bewertungen
Learn C++
Von Everand
Learn C++
Durgesh
Bewertung: 4.5 von 5 Sternen
4.5/5 (9)
Mastering Data Structures and Algorithms in C and C++
Von Everand
Mastering Data Structures and Algorithms in C and C++
Sachin Naha
Noch keine Bewertungen
Beginners Guide To Oracle EBS
Dokument5 Seiten
Beginners Guide To Oracle EBS
Swapnil Yeole
Noch keine Bewertungen
R Tutorial
Dokument26 Seiten
R Tutorial
kulchi
Noch keine Bewertungen
Matlab - Crash Course in Matlab - Found - at - (Redsamara - Com) PDF
Dokument59 Seiten
Matlab - Crash Course in Matlab - Found - at - (Redsamara - Com) PDF
osmanatam
Noch keine Bewertungen
Unit3 160420200647 PDF
Dokument146 Seiten
Unit3 160420200647 PDF
Arunkumar Gopu
Noch keine Bewertungen
Essential Algorithms: A Practical Approach to Computer Algorithms
Von Everand
Essential Algorithms: A Practical Approach to Computer Algorithms
Rod Stephens
Bewertung: 4.5 von 5 Sternen
4.5/5 (2)
Allen Bradley PLC
Dokument32 Seiten
Allen Bradley PLC
chayan_m_shah
Noch keine Bewertungen
MSOFTX3000 ATCA Platform Data Configuration of Hardware and Module Groups-20090622-B-1.0
Dokument51 Seiten
MSOFTX3000 ATCA Platform Data Configuration of Hardware and Module Groups-20090622-B-1.0
haned
Noch keine Bewertungen
Reading in Larger Datasets With Read - Table Reading in Larger Datasets With Read - Table
Dokument4 Seiten
Reading in Larger Datasets With Read - Table Reading in Larger Datasets With Read - Table
RustEd
Noch keine Bewertungen
Input: 1.1. Assignment
Dokument6 Seiten
Input: 1.1. Assignment
HalifatulKamelia
Noch keine Bewertungen
R Tutorial
Dokument119 Seiten
R Tutorial
Prithwish Ghosh
Noch keine Bewertungen
Sapabapmaterial 140803141350 Phpapp02 PDF
Dokument454 Seiten
Sapabapmaterial 140803141350 Phpapp02 PDF
SrinivasDukka
Noch keine Bewertungen
Programming Pearls - Self-Describing Data
Dokument5 Seiten
Programming Pearls - Self-Describing Data
Vikash Singh
Noch keine Bewertungen
Working Draft Working Draft Working Draft
Dokument72 Seiten
Working Draft Working Draft Working Draft
Bhai Bai
Noch keine Bewertungen
Instructions For Using R To Create Predictive Models v5
Dokument17 Seiten
Instructions For Using R To Create Predictive Models v5
Giovanni Firrincieli
Noch keine Bewertungen
R Lectures
Dokument10 Seiten
R Lectures
Apam Benjamin
Noch keine Bewertungen
Data Structures and Algorithms
Dokument41 Seiten
Data Structures and Algorithms
Matthew Tedunjaiye
Noch keine Bewertungen
Introduction To DATA STRUCTURES: Basic Principles
Dokument4 Seiten
Introduction To DATA STRUCTURES: Basic Principles
Abhi Mahajan
Noch keine Bewertungen
Xcms
Dokument11 Seiten
Xcms
Khurram Rajput
Noch keine Bewertungen
Day 2 Red Shift Arch
Dokument3 Seiten
Day 2 Red Shift Arch
Shyama
Noch keine Bewertungen
LaF Manual
Dokument9 Seiten
LaF Manual
manuelq9
Noch keine Bewertungen
Howtouser: 1 What Is R
Dokument6 Seiten
Howtouser: 1 What Is R
Li Ann
Noch keine Bewertungen
DataStage Stages 12-Dec-2013 12PM
Dokument47 Seiten
DataStage Stages 12-Dec-2013 12PM
nithinmamidala999
Noch keine Bewertungen
R Fundamentals (Hadley Wickham - Rice Univ)
Dokument66 Seiten
R Fundamentals (Hadley Wickham - Rice Univ)
urcanfleur
Noch keine Bewertungen
Resumo Postgresql
Dokument6 Seiten
Resumo Postgresql
Samuel Rodriguez
Noch keine Bewertungen
Journal of Statistical Software: Reviewer: Samuel E. Buttrey Naval Postgraduate School
Dokument4 Seiten
Journal of Statistical Software: Reviewer: Samuel E. Buttrey Naval Postgraduate School
Bom Villatuya
Noch keine Bewertungen
Exploratory Data Analysis and Graphics: Lab 2
Dokument19 Seiten
Exploratory Data Analysis and Graphics: Lab 2
juntujuntu
Noch keine Bewertungen
BD Seminar
Dokument23 Seiten
BD Seminar
kavya
Noch keine Bewertungen
Viva 1
Dokument21 Seiten
Viva 1
arefk
Noch keine Bewertungen
Good Afternoon To Everyone
Dokument4 Seiten
Good Afternoon To Everyone
Jayasimha Madhira
Noch keine Bewertungen
Assign 1 AIOU Data Structure
Dokument22 Seiten
Assign 1 AIOU Data Structure
Fawad Ahmad Jatt
Noch keine Bewertungen
Reading Delimited Text Files Into SAS®9: Technical Paper
Dokument10 Seiten
Reading Delimited Text Files Into SAS®9: Technical Paper
Bhuvan Malik
Noch keine Bewertungen
ASCII Stands For American Standard Code For Information Interchange
Dokument3 Seiten
ASCII Stands For American Standard Code For Information Interchange
bad_zurbic
Noch keine Bewertungen
Radix Sort: Problem Description
Dokument5 Seiten
Radix Sort: Problem Description
x_jain
Noch keine Bewertungen
Lab Handout - Tableau 01 - Beyond Basic Visualization v02
Dokument14 Seiten
Lab Handout - Tableau 01 - Beyond Basic Visualization v02
hammad
Noch keine Bewertungen
Edge Computing
Dokument5 Seiten
Edge Computing
Anastasia Wagner
Noch keine Bewertungen
A Tour of The Cool Support Code: 1995-2016 by Alex Aiken. All Rights Reserved
Dokument9 Seiten
A Tour of The Cool Support Code: 1995-2016 by Alex Aiken. All Rights Reserved
Houssem Nasri
Noch keine Bewertungen
WBUT Data C Book
Dokument587 Seiten
WBUT Data C Book
Subhajit Chakraborty
Noch keine Bewertungen
Differentiate Between Data Type and Data Structures
Dokument11 Seiten
Differentiate Between Data Type and Data Structures
krishnakumar
Noch keine Bewertungen
Instructions 1209457942424462
Dokument11 Seiten
Instructions 1209457942424462
Trance
Noch keine Bewertungen
Cam SQL Lab File
Dokument26 Seiten
Cam SQL Lab File
nancie3
Noch keine Bewertungen
Lecture 2
Dokument19 Seiten
Lecture 2
bluepixarlamp
Noch keine Bewertungen
Pandas For Machine Learning: Acadview
Dokument18 Seiten
Pandas For Machine Learning: Acadview
Yash Bansal
Noch keine Bewertungen
Computing Database Size: Dvanced Echnical Ocumentation
Dokument10 Seiten
Computing Database Size: Dvanced Echnical Ocumentation
Tubagus Purworusmiardi
Noch keine Bewertungen
R Tutorial
Dokument26 Seiten
R Tutorial
arshad121233
Noch keine Bewertungen
S&P Capital Interview Questions
Dokument1 Seite
S&P Capital Interview Questions
Nishtha Rai
Noch keine Bewertungen
The NET File Format
Dokument17 Seiten
The NET File Format
seifadiaz
Noch keine Bewertungen
Data Table PDF
Dokument102 Seiten
Data Table PDF
Cathleen Gilmore
Noch keine Bewertungen
Salut Macro
Dokument41 Seiten
Salut Macro
lijika
Noch keine Bewertungen
CPE 202 Lecture Notes
Dokument41 Seiten
CPE 202 Lecture Notes
Samuel jidayi
Noch keine Bewertungen
Notes 20221016
Dokument24 Seiten
Notes 20221016
Gluvenis L.
Noch keine Bewertungen
Chapter 13: Query Processing: Database System Concepts, 5th Ed
Dokument55 Seiten
Chapter 13: Query Processing: Database System Concepts, 5th Ed
dcse_rec
Noch keine Bewertungen
R Basics Continued - Factors and Data Frames - Intro To R and RStudio For Genomics
Dokument17 Seiten
R Basics Continued - Factors and Data Frames - Intro To R and RStudio For Genomics
Cleaver Bright
Noch keine Bewertungen
JG HG J HHGH 76765776
Dokument12 Seiten
JG HG J HHGH 76765776
GlobalStrategy
Noch keine Bewertungen
Viva Voce
Dokument5 Seiten
Viva Voce
gangwar098preeti
Noch keine Bewertungen
Chapter 13: Query Processing
Dokument49 Seiten
Chapter 13: Query Processing
Vysakh Sreenivasan
Noch keine Bewertungen
DBMS Lab Manual
Dokument65 Seiten
DBMS Lab Manual
Shiva kannan
Noch keine Bewertungen
Word Processor Data STR
Dokument29 Seiten
Word Processor Data STR
Arpit Tyagi
Noch keine Bewertungen
Coding In C Decoded: Decoded, #1
Von Everand
Coding In C Decoded: Decoded, #1
D. Brown
Noch keine Bewertungen
Magic Data: Part 1 - Harnessing the Power of Algorithms and Structures
Von Everand
Magic Data: Part 1 - Harnessing the Power of Algorithms and Structures
Chuck Sherman
Noch keine Bewertungen
Lec 3
Dokument24 Seiten
Lec 3
Anonymous any
Noch keine Bewertungen
Jjjduiii Js
Dokument13 Seiten
Jjjduiii Js
jaarit
Noch keine Bewertungen
V7IMSpost Installation Guide PC
Dokument37 Seiten
V7IMSpost Installation Guide PC
Truonglana
Noch keine Bewertungen
Jade Programming Tutorial
Dokument20 Seiten
Jade Programming Tutorial
Chu Văn Nam
Noch keine Bewertungen
Week 4 - What Is Olap
Dokument5 Seiten
Week 4 - What Is Olap
Angelique Garrido V
Noch keine Bewertungen
Mapping Models To Code
Dokument33 Seiten
Mapping Models To Code
tokhi
100% (1)
Maxbox Starter53 Realtime UML
Dokument8 Seiten
Maxbox Starter53 Realtime UML
Max Kleiner
Noch keine Bewertungen
Top 10 Strategies For Oracle Performance Part 3
Dokument10 Seiten
Top 10 Strategies For Oracle Performance Part 3
RAMON CESPEDES PAZ
Noch keine Bewertungen
Visirule 5.0 Reference
Dokument114 Seiten
Visirule 5.0 Reference
Malkeet Singh
Noch keine Bewertungen
BDC Sap Abap
Dokument39 Seiten
BDC Sap Abap
Raj Rajesh
Noch keine Bewertungen
Solid Edge Overview: Siemens PLM Software
Dokument3 Seiten
Solid Edge Overview: Siemens PLM Software
parthasutradhar
Noch keine Bewertungen
Trivia in Empowerment Tech
Dokument6 Seiten
Trivia in Empowerment Tech
Stephanie Sundiang
Noch keine Bewertungen
Fluke View Forms Manual
Dokument36 Seiten
Fluke View Forms Manual
Samsudin Ahmad
Noch keine Bewertungen
3330 Winter 2007 Final Exam With Solutions
Dokument11 Seiten
3330 Winter 2007 Final Exam With Solutions
maxz
Noch keine Bewertungen
CSS NC II - Diagnostic
Dokument2 Seiten
CSS NC II - Diagnostic
CherryFlorBolanioGenita
Noch keine Bewertungen
Pre-Installation Steps: Ubuntu-Linux Installation Tips
Dokument2 Seiten
Pre-Installation Steps: Ubuntu-Linux Installation Tips
gpsmlk
Noch keine Bewertungen
Abstract Data Types
Dokument28 Seiten
Abstract Data Types
tamirat
Noch keine Bewertungen
Sample Exam
Dokument2 Seiten
Sample Exam
seviyor
Noch keine Bewertungen
15cs338e-15cs338e-Database Security and Privacy - MCQ
Dokument6 Seiten
15cs338e-15cs338e-Database Security and Privacy - MCQ
algatesgiri
0% (1)
Xpadmin
Dokument17 Seiten
Xpadmin
firedigital
Noch keine Bewertungen
Rms 1314 Ig
Dokument130 Seiten
Rms 1314 Ig
Varun
Noch keine Bewertungen
PASW Statistics 18 DM Book
Dokument2.543 Seiten
PASW Statistics 18 DM Book
Alam Yien
Noch keine Bewertungen
DGCA RPAS Guidance Manual PDF
Dokument57 Seiten
DGCA RPAS Guidance Manual PDF
Saransh Lakra
Noch keine Bewertungen
Intro SDL Suite
Dokument32 Seiten
Intro SDL Suite
gsmman2006
Noch keine Bewertungen
The Inside Story On Shared Libraries and Dynamic Loading
Dokument8 Seiten
The Inside Story On Shared Libraries and Dynamic Loading
prabhakar_n1
Noch keine Bewertungen
Ricoh 010 b230 - b237 - d042 Troubleshooting Guide
Dokument19 Seiten
Ricoh 010 b230 - b237 - d042 Troubleshooting Guide
dwina roche
Noch keine Bewertungen