1.2 RStudio interface
Note: In this section I’ll start to occasionally refer to keyboard shortcuts. As a PC user, I will always show PC shortcuts. Hopefully, for Mac users it will be relatively easy to figure out what the appropriate Mac version is. Apologies in advance for any inconvenience.
Double check your RStudio version under Help > About RStudio, and your R version under Tools > Global Options. You will also always be able to check package versions under Tools > Check for package updates. When you experience errors, we’ll want to be able to investigate version compatibility as a source of the error.
You can update individual packages, or all of them, at any time using the Tools option noted previously. I generally wouldn’t recommend updating to the absolute latest version for its own sake, since this can sometimes cause past scripts to no longer run. Do keep updating in mind, though, as a way to solve compatibility issues that seem to exist between two users.
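The same version checks can also be done by typing commands, which is handy when you want to compare your setup with another user’s. This is a minimal sketch using only functions that ship with R; the stats package is just an illustrative choice, not one this section depends on.

```r
# Print the version of R that is currently running
R.version.string

# Look up the installed version of one package ("stats" ships with R)
stats_version <- packageVersion("stats")
stats_version

# Summarize R, the operating system, and all loaded packages in one report,
# which is convenient to paste when asking someone for help with an error
sessionInfo()
```

Sharing the sessionInfo() report alongside an error message usually makes version mismatches much quicker to spot.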
RStudio generally has four “windows”:
- Source: Default top left. If you don’t already see this window, go ahead and open up a new .Rmd file at File > New File > R Markdown and you’ll see some template code sitting in an “Untitled” document in the Source window (we’ll talk about how to read a .Rmd document soon). The Source window will generally have multiple tabs of either code documents you’re editing or environment data you’re viewing. You generally spend most of your time working here.
- Console: Default bottom left. This is where you actually “run” code. You can type lines of code directly into the Console and run them, or you can execute lines of code from the Source window, which basically “sends them” to the Console.
- Environment, etc.: Default top right. As you execute code you’ll end up creating “data” (generally done with a line of code that has the form data_variable_name <- functions), and any data that’s been created will sit in the “Environment”, where you’ll see individual objects listed. You’ll be able to refer to objects in the Environment as part of subsequent lines of code. To remove objects, you will generally type rm(data_variable_name) into the Console, or you can clear the whole Environment using the little broom icon. You will rarely use the other tabs in this window.
- Files, etc.: Default bottom right. This should basically look like a view into some folder on your computer. You’ll always have a “working directory” where your R code by default “looks for” objects to load from, or export outputs into. So generally you’ll want to view your working directory here. We’ll discuss this more soon. The other tabs in this window are useful, but generally they’ll automatically show up depending on specific code you execute, so we’ll talk about them later.
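To make the Environment description concrete, here is a small sketch you could run line by line while watching the Environment window. The object names my_numbers and my_total are made up for illustration.

```r
# Create two objects; both appear in the Environment once these lines run
my_numbers <- c(1, 2, 3)      # hypothetical example data
my_total <- sum(my_numbers)   # later lines can refer to earlier objects

# List the names of everything currently in the Environment
ls()

# Remove a single object by name
rm(my_numbers)

# Check whether the object still exists after removal
exists("my_numbers")

# Code equivalent of the broom icon (left commented out so that
# nothing gets cleared by accident):
# rm(list = ls())
```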
I personally prefer a different layout than the default. You can adjust layout under View > Panes > Pane Layout. I flip the Console and Environment windows, or, on a larger screen, add a third column so Source and Environment have their own columns. It’s also good to start practicing the shortcuts to temporarily zoom into one of these windows depending on what you’re doing: Try double-tapping Ctrl+Shift+1, Ctrl+Shift+2, Ctrl+Shift+8, or Ctrl+Shift+9 to see what I mean.
Typically the first thing I do when beginning to code is set the working directory for whatever I’m working on. In the Files window, navigate to your desired folder (usually the cloned GitHub repo where your code is located; more on that soon), then click the gear button More > Set as working directory. If you try this, you’ll see the written version of this step pop up in the Console, but I generally find it easier to set working directories using the user-friendly buttons.
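The written version that pops up in the Console is a call to setwd(). As a rough sketch of the same step done by hand, the example below switches to R’s temporary folder and back again; tempdir() is only a stand-in here for your own project folder.

```r
# Remember where R is currently looking for files
old_dir <- getwd()
old_dir

# Switch to another folder; tempdir() is a placeholder for your project folder
setwd(tempdir())

# Confirm the change and list the files R can now see by default
getwd()
list.files()

# Switch back so the rest of the session is unaffected
setwd(old_dir)
```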
Next, in the script window itself, the white gear wheel has an option to show chunk output “inline” or “in Console”. By default it’s “inline”, but I almost always switch to “in Console”. This is ultimately up to personal preference, but I like to always have code output show up in one place, the Console (or, if it’s a plot/map, in the Files window).
You can easily try out the Console by typing random math expressions and pressing Enter. Conversely, note that in line 19 of your standard .Rmd template you should see summary(cars). If you put your cursor anywhere on that line of text and press Ctrl+Enter, you should see that the Console essentially copied that one command and executed it. cars happens to be a default dataframe (table) built into R, and summary() tells you some general information about the dataframe. You can now also try typing summary(cars) directly into the Console, and write a new line of code next to line 19, like 1+1, and execute that using Ctrl+Enter (make sure you write this between the pair of three backticks, which we’ll explain soon, and that again your cursor is somewhere on the line you want to execute), to demonstrate to yourself that it’s all one and the same to the Console.
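For reference, these are the kinds of lines you might try, either typed into the Console or run from the Source window with Ctrl+Enter. The name cars_summary is invented for this sketch.

```r
# Arithmetic is evaluated immediately and the result is printed
1 + 1
(3 + 4) * 2

# The same command that the .Rmd template contains
summary(cars)

# Storing a result instead of printing it makes it show up in the Environment
cars_summary <- summary(cars)
cars_summary
```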
Some of the most common commands I’d run directly in the console are:
- View(data_variable_name) to view the dataframe that’s in the Environment
- names(data_variable_name) to view the field names of the dataframe
- nrow(data_variable_name) to view the number of rows in the dataframe
- mapview(data_variable_name) to quickly view a mappable object (you’ll need to have loaded the relevant library to do this, which we’ll explain later)
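As a quick illustration of the first three commands, here they are applied to the built-in cars dataframe. View() is left commented out because it opens an interactive viewer tab, and mapview() is omitted because it needs an extra package and a mappable object.

```r
# Field (column) names of the dataframe
col_names <- names(cars)
col_names            # "speed" "dist"

# Number of rows in the dataframe
row_count <- nrow(cars)
row_count            # 50

# Open the dataframe in a spreadsheet-like tab (interactive use only)
# View(cars)
```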
Otherwise, I’m generally running code “inline” by putting the blinking typing cursor on any part of a line and pressing Ctrl+Enter. If you only want to run part of a line (especially %>% pipelines, explained later), then you highlight a specific self-contained section before pressing Ctrl+Enter. Ctrl+Shift+Enter will run the entire chunk of code (which is between the pairs of three backticks).
Note that so far we’ve introduced one “function”, summary(). It’s simplest to think of functions as miniature “machines” that create some output based on some input(s) you feed them. You put the inputs inside the parentheses. Sometimes a function doesn’t need any inputs (we’ll encounter one soon); sometimes it needs multiple inputs, which are separated by commas. When there are multiple inputs, the function always has a “schema” that guides you in the correct order of inputs, and there’s always a way to explicitly tell the function what input you’re providing to fill what argument. You can generally see this explained if you search “r somefunction” on Google, or by typing ?somefunction in the Console to get a quick tutorial in the Files window (though this isn’t always as useful as Googling the same thing). If you try ?summary, you’ll see a list of arguments, and you’ll notice that cars was an input we provided to fill the first default argument, “object”.
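Here is a short sketch of inputs filled by position versus by name. Besides summary(), it uses seq() and Sys.Date(), which are base R functions chosen only for illustration and are not otherwise part of this section.

```r
# Filling the first argument by position, then explicitly by its name
by_position <- summary(cars)
by_name <- summary(object = cars)
identical(by_position, by_name)   # both calls do the same thing

# Multiple inputs separated by commas, matched in order: from, to, by
seq(1, 10, 3)

# Naming the arguments lets you supply them in any order
seq(by = 3, to = 10, from = 1)

# A function that needs no inputs still requires the parentheses
Sys.Date()

# Show a function's arguments without opening the help page
args(seq.default)
```

Naming arguments is more typing, but it makes a line of code easier to read later and protects against mixing up the order.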
Other miscellaneous notes at this point:
- Observe what happens when you double click and triple click on code. Generally you get an entire “word” or “line”, which can help speed up certain kinds of copy/paste operations.
- Similarly, observe what happens when you hold Ctrl, Shift, Ctrl+Shift, or Alt and move up/down/left/right through code. All of these combinations can be handy to speed up different kinds of tasks.
- I use Ctrl+PgUp/PgDn for fast navigation through the document.
- Comment out and uncomment lines in Source documents by putting your cursor anywhere on the line and typing Ctrl+Shift+C.
Generally keep in mind that most things you want to do in R and RStudio are “designed” to be easy, but the knowledge can’t easily be conveyed to you (beyond guided tutorials like these), so you often have to do the Googling yourself to seek out those efficiency gains. Don’t assume that something can’t be done, leaving you stuck doing something that seems difficult to you; assume somebody has solved the problem somewhere! This is of course also very true when it comes to the code itself.
If you need a deeper explanation of these and many other fundamental concepts we’ll skip over in this course, start here, then do more Googling.