Now that you have both R and Python installed, we can get started by taking a tour of our two integrated development environments (IDEs): RStudio and Spyder.
I will also introduce a few topics briefly, so that we can get our feet wet:
- creating variables, and
- calling functions.
Go ahead and open up RStudio. It should look something like this:
I changed my “Editor Theme” from the default to “Cobalt” because it’s easier on my eyes. If you are opening RStudio for the first time, you probably see a lot more white. You can play around with the theme, if you wish, after going to
Tools -> Global Options -> Appearance.
The console, which is located by default on the lower left panel, is the place where all of your code gets run. For short one-liners, you can type code directly into the console. Try typing the following code in there. Here we are making use of the print() function.
In R, functions are “first-class objects,” which means you can refer to a function by name without asking it to do anything. However, when we do want to use it, we put parentheses after the name. This is called calling the function or invoking the function. If a function call takes any arguments (aka inputs), then the programmer supplies them between the two parentheses. A function may return values to be subsequently used, or it may just produce a “side-effect” such as printing some text, displaying a chart, or reading/writing information to an external data source.
During the semester, we will write more complicated code. Complicated code is usually written incrementally and stored in a text file called a script. Click
File -> New File -> R Script to create a new script. It should appear at the top left of the RStudio window (see Figure 1.1). After that, copy and paste the following code into your script window.
This script will run five print statements, and then create a variable called
myName. The print statements are of no use to the computer and will not affect how the program runs. They just display messages to the human running the code.
The variable created on the last line is more important because it is used by the computer, and so it can affect how the program runs. The operator
<- is the assignment operator. It takes the character constant
"Taylor", which is on the right, and stores it under the name
myName. If we added lines to this program, we could refer to the variable
myName in subsequent calculations.
Save this file wherever you want on your hard drive. Call it
awesomeScript.R. Personally, I saved it to my desktop.
After we have a saved script, we can run it by sending all the lines of code over to the console. One way to do that is by clicking the
Source button at the top right of the script window (see Figure 1.1). Another way is to use R’s
source() function¹. We can run the following code in the console.
The first line changes the working directory to
Desktop/. The working directory is the first place your program looks for files. You, dear reader, should change this line by replacing
Desktop/ with whichever folder you chose to save
awesomeScript.R in. If you would like to find out what your working directory is currently set to, you can use getwd().
Every computer has a different folder/directory structure, which is why it is highly recommended that you refer to file locations as seldom as possible in your scripts. This makes your code more portable. Otherwise, when you send your file to someone else (e.g. your instructor or your boss), she will have to remove or change every mention of a directory, because those directories (probably) won’t exist on her machine.
The second line calls
source(). This function finds the script file and executes all the commands found in that file sequentially.
Deleting all saved variables, and then
source()ing your script can be a very good debugging strategy. You can remove all saved variables by running
rm(list=ls()). Don’t worry–the variables will come back as soon as you
source() your entire script again!
First, start by opening Anaconda Navigator. It should look something like this:
Recall that we will exclusively assume the use of Spyder in this textbook. Open that up now. It should look something like this:
It looks a lot like RStudio, right? The script window is still on the left hand side, but it takes up the whole height of the window this time. However, you will notice that the console window has moved. It’s over on the bottom right now.
Again, you might notice a lot more white when you open this for the first time. Just like last time, I changed my color scheme. You can change yours by going to
Tools -> Preferences and then exploring the options available under the
Try typing the following line of code into the console.
Already we have many similarities between our two languages. Both R and Python have a
print() function, and they both use the same symbol to start a comment:
#. Finally, they both define character/string constants with quotation marks. In both languages, you can use either single or double quotes.
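As a small illustration of these similarities on the Python side (a sketch, not the original listing), the lines below call print(), use # to start comments, and check that single and double quotes produce the same string. The variable names are made up for the example.

```python
# Anything after a '#' is a comment and is ignored by Python
print("hello world")   # double quotes
print('hello world')   # single quotes work just as well

# Both kinds of quotes create the same string
greeting_double = "hello world"
greeting_single = 'hello world'
same_string = (greeting_double == greeting_single)
print(same_string)
```

The last line should report that the two strings are equal, which is why the choice of quote style is mostly a matter of taste.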
We will also show below that both languages share the same three ways to run scripts. Nice!
Let’s try writing our first Python script. R scripts end in
.R, while Python scripts end in
.py. Call this file awesomeScript.py.
Notice that the assignment operator is different in Python. It’s an equals sign (=).
Just like RStudio, Spyder has a button that runs the entire script from start to finish. It’s the green triangle button (see Figure 1.3).
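For reference, a Python script in the spirit of the one described above might look like the sketch below. It is an assumption modeled on the description of the R script (a few print statements followed by an assignment to myName). The printed messages are placeholders, not the text of the original listing.

```python
# awesomeScript.py (illustrative sketch; the messages are placeholders)

# print statements only display messages to the person running the code
print("R and Python are both great")
print("this line runs second")
print("scripts run from top to bottom")

# '=' is Python's assignment operator: it stores the string "Taylor"
# under the name myName so that later lines can refer to it
myName = "Taylor"
print("my name is " + myName)
```

Pressing the run button sends every one of these lines to the console, in order.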
You can also write code to run
awesomeScript.py. There are a few ways to do this, but here’s the easiest.
This is also pretty similar to the R code from before.
os.chdir() sets our working directory to the
runfile() runs all of the lines in our program, sequentially, from start to finish³.
The first line is new, though. We did not mention anything like this in R, yet. We will talk more about
importing modules in section 10.4. Suffice it to say that we imported the
os module to make the
chdir() function available to us.
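The sketch below shows the same idea in a self-contained form. Note that runfile() is supplied by Spyder's console rather than by Python itself, so this example swaps in runpy.run_path() from the standard library, which also runs a script from top to bottom. The temporary folder and the one-line script written into it are made up for the demonstration; they stand in for your own folder and your own awesomeScript.py.

```python
import os
import runpy
import tempfile

original_dir = os.getcwd()          # remember where we started

# Hypothetical stand-in for the folder that holds the script
with tempfile.TemporaryDirectory() as script_dir:
    # Write a tiny placeholder script so the example is self-contained
    with open(os.path.join(script_dir, "awesomeScript.py"), "w") as f:
        f.write('myName = "Taylor"\n')

    os.chdir(script_dir)            # change the working directory
    current_dir = os.getcwd()       # ask Python where we are now

    # Run every line of the script; the variables it created are
    # returned in a dictionary
    script_vars = runpy.run_path("awesomeScript.py")

    os.chdir(original_dir)          # move back before the folder is deleted

print(script_vars["myName"])
```

Because the working directory was changed first, the script could be named without spelling out the full path to it.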
Programming is not about memorization. Nobody can memorize, for example, every function and all of its arguments. So what do programmers do when they get stuck? The primary way is to find and read the documentation.
Getting help in R is easy. If you want to know more about a function, type into the console the name of the function with a leading question mark. For example,
?setwd. You can also use
help() to find out more about functions (e.g.
help(print)). Sometimes you will need to put quotation marks around the name of the function (e.g.
This will not open a separate web browser window, which is very convenient. If you are using RStudio, you have some extra benefits. Everything will look very pretty, and you can search through the text by typing phrases into the search bar in the “Help” window.
In Python, the question mark comes after the name of the function⁴ (e.g.
print?), and you can use
help(print) just as in R.
In Spyder, if you want the documentation to appear in the Help window (it looks prettier), then you can type the name of the function, and then press
Ctrl-i (Cmd-i on a Mac keyboard).
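Here is a short Python sketch of the options that work from code. help() is built into Python and works in any console or script, whereas the trailing question mark is a feature of the IPython console that Spyder uses, so it is left out here. The variable names are chosen only for this example.

```python
# help() prints the documentation for an object to the console
help(print)

# The same text is stored on the function itself as its "docstring"
doc_text = print.__doc__
first_line = doc_text.splitlines()[0]
print(first_line)
```

The exact wording of the documentation can differ a little between Python versions.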
File paths look different on different operating systems. Mac and Linux machines tend to have forward slashes (i.e.
/), while Windows machines tend to use backslashes (i.e. \).
Depending on what kind of operating system is running your code, you will need to change the file paths. It is important for everyone writing R and Python code to understand how things work on both types of machines–just because you’re writing code on a Windows machine doesn’t mean that it won’t be run on a Mac, or vice versa.
The directory repeatedly mentioned in the code above was
/home/taylor/Desktop. This is a directory on my machine which is running Ubuntu Linux. The leading forward slash is the root directory. Inside that is the directory
home/, and inside that is
taylor/, and inside that is
Desktop/. If you are running MacOS, these file paths will look very similar. The folder
home/ will most likely be replaced with Users/.
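To see this nesting explicitly, Python's standard pathlib module can split a path into its pieces. This is only an illustrative sketch using the example directory above. PurePosixPath works on the text of the path alone, so the folder does not need to exist on your machine.

```python
from pathlib import PurePosixPath

desktop = PurePosixPath("/home/taylor/Desktop")

# Each piece of the path, starting from the root directory "/"
pieces = desktop.parts
print(pieces)

# The directory that contains Desktop/
print(desktop.parent)
```

Reading the pieces from left to right retraces the description above: the root, then home, then taylor, then Desktop.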
On Windows, things are a bit different. For one, a full path starts with a drive (e.g.
C:). Second, there are backslashes (not forward slashes) to separate directory names (e.g
Unfortunately, backslashes are a special character in both R and Python (read section 3.9 to find out more about this). Whenever you type a
\, it will change the meaning of whatever comes after it. In other words,
\ is known as an escape character.
In both R and Python, the backslash character is used to start an “escape” sequence, such as \n for a new line or \t for a tab. In Python it may also be used at the end of a line to let a long statement continue on the next line of the text file.
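A brief Python sketch of both uses of the backslash follows. The strings and variable names are invented for the example.

```python
# "\n" is a single newline character and "\t" is a single tab,
# even though each is typed with two keystrokes
two_lines = "first\nsecond"
tabbed = "a\tb"
print(two_lines)
print(len(tabbed))

# A backslash at the very end of a line continues the statement
total = 1 + 2 + \
        3
print(total)
```

Notice that the backslash never shows up in the printed text; it only changes the meaning of the character that follows it.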
The recommended way of handling this is to just use forward slashes instead. For example, if you are running Windows,
C:/Users/taylor/Desktop/myScript.R will work in R, and
C:/Users/taylor/Desktop/myScript.py will work in Python.
You may also use “raw string constants” (e.g.
r'C:\Users\taylor\my_file.txt' ). “Raw” means that
\ will be treated as a literal character instead of an escape character. Alternatively, you can “escape” the backslashes by replacing each single backslash with a double backslash. Please read section 3.9 for more details about these choices.
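The three options can be compared side by side in Python. This is a sketch that reuses the example path from above; PureWindowsPath comes from the standard pathlib module and only manipulates the text of the path, so it runs on any operating system and the file does not need to exist.

```python
from pathlib import PureWindowsPath

forward = "C:/Users/taylor/my_file.txt"      # forward slashes
raw = r"C:\Users\taylor\my_file.txt"         # raw string constant
escaped = "C:\\Users\\taylor\\my_file.txt"   # doubled backslashes

# The raw and escaped versions are exactly the same string
raw_equals_escaped = (raw == escaped)

# PureWindowsPath treats both kinds of slashes as separators
same_location = (PureWindowsPath(forward) == PureWindowsPath(raw))

print(raw_equals_escaped, same_location)
```

All three spellings point to the same file, so the forward-slash version is usually the least error-prone to type.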
A third way is to tell R to run
awesomeScript.R from the command line, but unfortunately, this will not be discussed in this text.
You can use this symbol in R, too, but it is less common.
Python, like R, allows you to run scripts from the command line, but this will not be discussed in this text.