Scatter plot by group in R

Sample data

This tutorial uses the following randomly generated vectors. Copy and paste them into your console to run the examples below.

# Data
set.seed(34)
x <- runif(300)
y <- 5 * x ^ 2 + rnorm(length(x), sd = 2)
group <- ifelse(x < 0.4, "Group 1",
                 ifelse(x > 0.8, "Group 2",
                        "Group 3"))
# Add some noise to x after defining the groups, so the groups overlap slightly
x <- x + runif(length(x), -0.2, 0.2)

Scatter plot by group

If you have a grouping variable, you can create a scatter plot by group by passing the variable (as a factor) to the col argument of the plot function, so that each group is displayed in a different color.

# Scatter plot
plot(x, y,
     pch = 19,
     col = factor(group))

# Legend
legend("topleft",
       legend = levels(factor(group)),
       pch = 19,
       col = factor(levels(factor(group))))

Scatter plot by group in base R

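Note that a factor is stored internally as integer codes, and plot uses those codes to pick colors from the current palette (1 = "black", 2 = "red", 3 = "green", …). The exact shades depend on your R version, because the default palette changed in R 4.0.0.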
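
To see this mapping for yourself, the following minimal sketch inspects the factor levels, the integer codes behind them and the palette entries those codes select. The variable name group_factor is introduced here only for illustration, and the output of palette() depends on your R version and on any palette you have set in the session.

```r
# Recreate the grouping variable from the sample data
set.seed(34)
x <- runif(300)
group <- ifelse(x < 0.4, "Group 1",
                ifelse(x > 0.8, "Group 2",
                       "Group 3"))

# group_factor is an illustrative name, not part of the tutorial data
group_factor <- factor(group)

# Levels are sorted alphabetically by default
levels(group_factor)

# Integer codes stored for the first few observations
head(as.integer(group_factor))

# Palette entries that those codes select (one per level)
palette()[seq_along(levels(group_factor))]
```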

Change the default colors

If you want to change the default colors, you can create a vector of colors and index it with the factor, as in the following block of code.

Give color to each class in R scatter plot

# Color selection
colors <- c("#FDAE61", # Orange
            "#D9EF8B", # Light green
            "#66BD63") # Darker green

# Scatter plot
plot(x, y,
     pch = 19,
     col = colors[factor(group)])

# Legend
legend("topleft",
       legend = c("Group 1", "Group 2", "Group 3"),
       pch = 19,
       col = colors)

Note, however, that the colors are matched to the levels of the factor, not to the order in which the groups appear in the data: the first color goes to the first level, and so on (orange for group 1, light green for group 2 and dark green for group 3).
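
If you would rather not depend on the level order at all, one possible alternative is to name the elements of the color vector and index it by the group labels. This is only a sketch: group_colors is a name chosen for this example, and it assumes group is a character vector (with a factor, use as.character(group), since a factor would be indexed by its integer codes).

```r
# Sample data, as defined at the start of the tutorial
set.seed(34)
x <- runif(300)
y <- 5 * x ^ 2 + rnorm(length(x), sd = 2)
group <- ifelse(x < 0.4, "Group 1",
                ifelse(x > 0.8, "Group 2",
                       "Group 3"))
x <- x + runif(length(x), -0.2, 0.2)

# Named vector: each color is tied to a group label
# (group_colors is an illustrative name)
group_colors <- c("Group 1" = "#FDAE61", # Orange
                  "Group 2" = "#D9EF8B", # Light green
                  "Group 3" = "#66BD63") # Darker green

# Scatter plot: colors are looked up by label, not by factor code
plot(x, y,
     pch = 19,
     col = group_colors[group])

# Legend built from the same named vector
legend("topleft",
       legend = names(group_colors),
       pch = 19,
       col = group_colors)
```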

Reorder the colors of the groups

As the displayed color is based on the levels of the grouping variable, you can reorder the levels to change which color each group receives, as in the following example.

# Color selection
colors <- c("#FDAE61", # Orange
            "#D9EF8B", # Light green
            "#66BD63") # Darker green

# Reorder the factor levels
reordered_groups <- factor(group, levels = c("Group 2",
                                             "Group 1",
                                             "Group 3"))
# Scatter plot
plot(x, y,
     pch = 19,
     col = colors[reordered_groups])

# Legend (the colors are re-indexed so each label keeps the color of its group)
legend("topleft",
       legend = c("Group 1", "Group 2", "Group 3"),
       pch = 19,
       col = colors[factor(levels(reordered_groups))])

Reorder the colors of a scatter plot in R

Now the first color (orange) is for group 2, light green for group 1 and dark green for group 3.
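
A simpler way to keep the legend in sync, sketched here under the assumption that you are happy for the legend entries to follow the new level order, is to take the labels directly from the levels of the reordered factor. The name color_map is introduced only to show which color each group received.

```r
# Sample data, as defined at the start of the tutorial
set.seed(34)
x <- runif(300)
y <- 5 * x ^ 2 + rnorm(length(x), sd = 2)
group <- ifelse(x < 0.4, "Group 1",
                ifelse(x > 0.8, "Group 2",
                       "Group 3"))
x <- x + runif(length(x), -0.2, 0.2)

colors <- c("#FDAE61", # Orange
            "#D9EF8B", # Light green
            "#66BD63") # Darker green

reordered_groups <- factor(group, levels = c("Group 2",
                                             "Group 1",
                                             "Group 3"))

# Which color each group received (color_map is an illustrative name)
color_map <- setNames(colors, levels(reordered_groups))
color_map

# Scatter plot
plot(x, y,
     pch = 19,
     col = colors[reordered_groups])

# Legend in level order: labels and colors stay aligned
legend("topleft",
       legend = levels(reordered_groups),
       pch = 19,
       col = colors)
```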
