Modern Statistical Computing in R
Universidad Pompeu Fabra
Area of Study: Computer Info Systems
Taught In English
Prerequisites: A basic course in statistics.
Recommended U.S. Semester Credits: 3
Recommended U.S. Quarter Units: 4
Hours & Credits
Overview

Language of Instruction: English
Professor: Albert Satorra
Professor's Contact and Office Hours: Albert Satorra (email@example.com)
Course Contact Hours: 45
Recommended Credit: 5 ECTS credits
Course Prerequisites: A basic course in statistics
Language Requirements: None

Course Description:
In recent years, R has become the leading software tool for statistical computing and graphics. The software is greatly enhanced by numerous contributed packages submitted by users. The majority of computing in the leading applied statistical journals is done in R, and it is used almost exclusively in some of the leading-edge applications, such as genetics and data mining. The purpose of this course is to set a foundation for full exploitation and creative use of R, the statistical language for computing and graphics.

Much of the statistical methodology implemented in software packages is used in the form of a black box. This is advantageous for a user who is not interested in the details of the methods, but the result is often a second-rate application: the implementation, even if of high quality, is often meant for a different context; small details in the setting of options are ignored or misunderstood; and the output, formatted for general interest, is difficult to navigate.

The course will introduce students to the syntax and inner workings of R, so that they become proficient in everyday computational tasks with datasets of all kinds and skilled in applications of elementary statistical methods, with an emphasis on (initial) data exploration and simple graphics. Focus will also be placed on opportunities to enhance the learning experience in other statistical courses by illustrating and applying basic statistical concepts in R.

Learning Objectives:
At the end of the course, students will have learned:
- to use a fundamental tool for computing in the practice of quantitative analytical methods (the "paper-and-pencil" tool of the 21st century) that can work for the small jobs (like a pocket calculator) as well as for the big jobs (complex statistical data analysis).
- programming, data handling, transformations, subsetting, exploratory data analysis, probability distributions and simulations, regression and linear models, summarising data, how to handle large data sets, effective graphics.
- modern concepts of statistics based on simulations, and writing a report of a quantitative analysis.

Course Workload:
The course is divided into lectures, discussions, practice with portable computers, and tutoring. Students should be prepared to read between 50 and 150 pages per week.

Methods of Instruction:
The course includes both lectures and field studies. Two-hour classroom sessions are normally divided into a one-hour lecture and one hour of practice in computing. Students are required to come with their own laptops.

Method of Assessment:
Assessment consists of class participation (15%), homework and a mini project (20%, the equivalent of the Midterm Exam), and a main project (65%). The main function of the homework and mini project is to prepare students for the main project. The main project will involve some computing in R and submission of a report of up to 6 typed pages (not counting appendices). Students will select their projects on their own (subject to approval by the instructor) and will make a brief oral presentation at the end of the course (the equivalent of the Final Exam).

Class Participation: 15%
Midterm Exam: 20%
Final Exam: 65%

1. General introduction to computing
   Using R as a calculator
   Numbers, words and logicals; missing values (NA)
   Vectors and their attributes (names, length, type)
   System- and user-defined objects
   Accessing data (data()). Data in the system and data outside the system (read.table, scan)
2. First steps in graphics
   The basics of R syntax
   The R workspace
   Matrices and lists
   Subsetting
   System-defined functions; the help system
   Errors and warnings; coherence of the workspace
3a. Data input and output; interface with other software packages
   Writing your own code; R script
   Good programming practice
   R syntax - further steps
   The parentheses and brackets
3b. Exploratory data analysis
   Range, summary, mean, variance, median, sd, histogram, box plot, scatterplot
4. Probability distributions. Simulations
   Random number generation
   Distributions, the practice of simulation
5. Apply-type functions
   Compiling and applying functions
   Documentation
   Conditional statements
   Loops and iterations
6. Statistical functions in R
   Statistical inference, contingency tables, chi-square goodness of fit, regression, linear models, advanced modeling methods
7. Graphics; beyond the basics
   Graphics and tables
   Working with larger datasets
   Principles of exploratory data analysis (big data analysis)
8. Dataframes in R
   Defining your own classes and operations
   Models and methods in R
   Customising the user's environment

UPF Study Abroad Program 2016

Required Readings:
Handout material will be posted on the web as the course evolves.

Recommended bibliography:
Students are encouraged to consult the following sources on their own.
Dalgaard, P. (2002). Introductory Statistics with R. Springer.
Dennis, B. (2013). The R Student Companion. Taylor & Francis Group.
Matloff, N. (2011). The Art of R Programming: A Tour of Statistical Software Design. William
Pollock, P. H. (2014). An R Companion to Political Analysis. CQ Press.
Chihara, L. and Hesterberg, T. (2011). Mathematical Statistics with Resampling and R. Wiley.
Lander, J. P. (2014). R for Everyone: Advanced Analytics and Graphics. Addison-Wesley Data & Analytics Series.
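
To give a flavour of the numbered topics in the course outline above, here is a minimal R sketch touching on vectors and missing values, subsetting, simulation, apply-type functions, and a linear model. It is not taken from the course materials: the data, the variable names (heights, tall, sample_means, fit) and the choice of examples are all assumptions made for illustration. Only base R functions are used.

```r
# Illustrative only: all data below are invented for this sketch.

# Topic 1: named vectors and missing values (NA)
heights <- c(ana = 172, ben = NA, carla = 165, dan = 180)
n_obs <- length(heights)                     # counts the NA entry too
mean_height <- mean(heights, na.rm = TRUE)   # drop the NA before averaging

# Topic 2: subsetting with a logical condition (NA entries are excluded)
tall <- heights[!is.na(heights) & heights > 170]

# Topic 4: simulation, with a fixed seed so the run is reproducible
set.seed(1)
x <- rnorm(100, mean = 0, sd = 1)

# Topic 5: an apply-type function instead of an explicit loop;
# each element is the mean of a fresh sample of 30 standard normals
sample_means <- sapply(1:200, function(i) mean(rnorm(30)))

# Topic 6: a simple linear model fitted to simulated data
# (true intercept 2 and true slope 3 are assumptions of this sketch)
y <- 2 + 3 * x + rnorm(100, sd = 0.5)
fit <- lm(y ~ x)
slope <- unname(coef(fit)["x"])
```

Running summary(fit) or hist(sample_means) after this block would connect the sketch to the exploratory and graphical topics (3b and 7) as well.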
Please note that there are no beginning-level Spanish courses offered in this program.
Courses and course hours of instruction are subject to change.