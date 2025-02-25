

In that aside in my last post, I commented that the reasons why I won’t be running R in production anymore were out of scope. This post is intended to explain those reasons in much more technical detail, to the extent that I’m capable.

Admittedly, this was a pretty inflammatory thing to say. I know there’s a whole community of R developers working on R in production, as well as lots of believers in R for production services. I know this, in part, because I’ve been one of them in the past. But more on that in a bit.

A couple weeks ago, I wrote a high-level post on REST APIs. One thing that I noted was that I couldn’t, in good faith, recommend running R (or Plumber, a common library used to create APIs in R) in any type of high-load production system.

I’ve argued with my fair share of people on the internet about R in production, and am aware of the usual pro-R arguments. I know about the Put R In Prod talks, and have used Plumber and RestRserve. I’m familiar with vetiver and the suite of MLOps tools that the Tidymodels team has been working on building out. In the past, I’ve referenced things like Put R in Prod as evidence that you can, in fact, run R in production. But I always felt a bit queasy about it: How was it, I’d ask myself, that I could really only find one reference to a company genuinely running R in production, when virtually every place that does machine learning that I’m aware of has experience with Python, Rust, Scala, or similar? This post is a long-form answer to that question.

I say this because I don’t want the rest of this post to seem as if it’s coming from someone parroting the same Python lines about how “it’s a general purpose programming language” and how “R is made for statisticians so it’s not meant for production” or any of the other usual arguments against R. My view is that most of these arguments are just people being dogmatic, and that most of those common criticisms of R are being leveled by people who have never actually worked in R.

First things first: I love R. I’ve been a bit of an R evangelist for the past five years or so, and I think that R provides fantastic tooling that helps me do my day-to-day work dramatically (think maybe 3-4x) faster than equivalent tooling in Python, SQL, etc. I think I could argue strongly that the Tidyverse suite of tools has had a larger impact on how I write analytical code and how I think about data wrangling problems – in addition to just how I program in general – than any other single technical thing I’ve ever come across. In particular, purrr introduced me to functional programming and using functional patterns in the analytical code I write, and I haven’t looked back since.

This post is about high-load, online systems. You might think of this, roughly, as a system that’s getting, say, more than one request per second on average, at least five requests per second at peak times, and there’s a requirement that the service responds in something like 500 milliseconds (p95) with a p50 of maybe 100. For the rest of this post, that is the kind of system I’m describing when I say “production.”

Before I get into the guts of this post, I want to re-emphasize part of the tagline. When I say “production” in this post, I mean high-load production systems. I’m not talking about Shiny apps. I’m not talking about APIs getting one request every few seconds. I’m not talking about “offline” services where response times don’t particularly matter. I’ve had lots of success using R in all of those settings, and I think R is a great tool for solving problems in those spaces.

Problems

We’ve run into a number of problems with R in production. In broad strokes, the issues we’ve had have come from both Plumber, the API library we were using, and R itself. The next few sections cover some of the issues that caused the most headaches for us, and ultimately led us to switch over to FastAPI.

Gunicorn, Web Servers, and Concurrency

First and foremost: R is single-threaded. This is one of the most common criticisms I hear about R running in production settings, especially in the Python vs. R for production discussions I’ve been in. Of course, those discussions tend to ignore that Python also runs single-threaded, but I digress.

This post will be a bit more technical than some of my others. Since it’s already going to be long, I won’t be doing as much explaining the meanings of terms like “single-threaded” or similar.

R running single-threaded and not managing concurrency particularly well isn’t a problem in and of itself. Other languages (Python, Ruby, etc.) that are very often used in production systems of all sizes have the same issue. The problem with R in particular is that unlike Python, which has Gunicorn, Uvicorn, and other web server implementations, and Ruby, which has Puma and similar, R has no widely-used web server to help it run concurrently.

In practice, this means that if you, for instance, were to run a FastAPI service in production, you’d generally have a “leader” that delegates “work” (processing requests) to workers. Gunicorn or Uvicorn would handle this for you. This would mean that your service can handle as many concurrent requests as you have workers without being blocked.

As I mentioned, R has no equivalent web server implementation, which, in combination with running single-threaded, means that a Plumber service really can only handle one request at a time before getting blocked. In my view, this makes running high-load production services in R a non-starter, as concurrency and throughput are the ultimate source of lots of scalability problems in APIs. Yes, Plumber does indeed integrate with future and promises to allow for some async behavior, but my view is that it’s hard to make an argument that async Plumber is a viable substitute for a genuinely concurrent web server.
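To make the leader/worker point concrete, here is a toy simulation rather than a real web server. It assumes every request blocks for a fixed 50 milliseconds, and it uses a thread pool as a stand-in for the worker processes that Gunicorn or Uvicorn would manage. The names `handle_request` and `serve` are hypothetical and not part of any library.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id: int) -> int:
    """Stand-in for a blocking request handler (assumed to take 50 ms)."""
    time.sleep(0.05)
    return request_id


def serve(request_ids, n_workers):
    """Process all requests with a fixed pool of workers; return results and elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(handle_request, request_ids))
    return results, time.perf_counter() - start


incoming = list(range(8))  # eight requests arriving at the same moment

# One worker: each request waits for the previous one to finish.
one_worker_results, one_worker_seconds = serve(incoming, n_workers=1)

# Four workers: up to four requests are in flight at once.
four_worker_results, four_worker_seconds = serve(incoming, n_workers=4)

print(f"1 worker:  {one_worker_seconds:.3f}s")
print(f"4 workers: {four_worker_seconds:.3f}s")
```

With one worker the eight requests queue up behind each other, which is roughly the position a single Plumber process is in. With four workers, up to four requests are handled at the same time, so the whole batch finishes sooner.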
But let’s put aside the “non-starter” bit for a second, and let’s imagine that you, like me, want to try everything in your power to get R working in production. The following sections will cover other issues we’ve run into, and a number of workarounds we attempted, to varying degrees of success.

In my opinion, one of the biggest issues with R is the type system. R is dynamically typed, and primitive types are generally represented as length-one vectors. That's why these two variables are of the same type:

    class(1)
    #> [1] "numeric"

    class(c(1, 2))
    #> [1] "numeric"

This is a big problem. What happens when we try to serialize the number 1 to JSON?

    jsonlite::toJSON(1)
    #> [1]

It returns [1] – as in: A length-one list, where the one element is the number one. Of course, you can set auto_unbox = TRUE, but that has other issues:

    jsonlite::toJSON(1, auto_unbox = TRUE)
    #> 1

This is fine, but the problem with auto_unbox = TRUE is that if you have a return type that is genuinely a list, it could sometimes return a list, and sometimes return a single number, depending on the length of the thing being returned:

    get_my_fake_endpoint <- function(x) {
      jsonlite::toJSON(x + 1, auto_unbox = TRUE)
    }

    get_my_fake_endpoint(1)
    #> 2

    get_my_fake_endpoint(c(1, 2))
    #> [2,3]

In these two examples, I've gotten two different response types depending on the length of the input: One was a list, the other was an integer. This means that, without explicit handling of this edge case, your client has no guarantee of the type of the response it's going to get from the server, which will inevitably be a source of errors on the client side.

In every other programming language that I'm aware of being used in production environments, this is not the case. For instance:

    import json
    import sys

    x = 1
    y = [1, 2]

    print(type(x))
    # <class 'int'>

    print(type(y))
    # <class 'list'>

    json.dump(x, sys.stdout)
    # 1

    json.dump(y, sys.stdout)
    # [1, 2]

In Python, the number 1 is an integer type. The list [1, 2] is a list type. And the JSON library reflects that. No need for unboxing.

But there's more! R (and Plumber) also do not enforce types of parameters to your API, as opposed to FastAPI, for instance, which does via the use of pydantic.
That means that if you have a Plumber route that takes an integer parameter n and someone calls your route with ?n=foobar, you won't know about that until the rest of your code runs, at which point you might get an error about n being non-numeric. Here's an example:

    library(plumber)

    pr() %>%
      pr_get(
        "/types",
        function(n) {
          n * 2
        }
      ) %>%
      pr_run()

Obviously, n is intended to be a number. You can even define it as such in an annotation like this:

    #* @param n:int

But R won't enforce that type declaration at runtime, which means you need to explicitly handle all of the possible cases where someone provides a value for n that is not of type int. For instance, if you call that service and provide n=foobar, you'd see the following in your logs (and the client would get back an unhelpful HTTP 500 error):

    <simpleError in n * 2: non-numeric argument to binary operator>

If you do the equivalent in FastAPI, you'd have vastly different results:

    from fastapi import FastAPI

    app = FastAPI()

    @app.get("/types")
    async def types(n: int) -> int:
        return n * 2

Running that API and making the following call returns a very nice error:

    curl "http://127.0.0.1:8000/types?n=foobar" | jq

    {
      "detail": [
        {
          "loc": [
            "query",
            "n"
          ],
          "msg": "value is not a valid integer",
          "type": "type_error.integer"
        }
      ]
    }

I didn't need to do any type checking. All I did was supply a type annotation, just like I could in Plumber, and FastAPI, via pydantic, did all the lifting for me. I provided foobar, which is not a valid integer, and I get a helpful error back saying that the value I provided for n is not a valid integer. FastAPI also returns an HTTP 422 error (the error code is configurable), which tells the client that they did something wrong, as opposed to the 500 that Plumber returns, indicating that something went wrong on the server side.
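For a sense of what "explicitly handle all of the possible cases" involves when the framework won't do it for you, here is a rough sketch of the manual check. It is written in plain Python so it runs without FastAPI or pydantic installed. The names `parse_int_param` and `types_endpoint` are hypothetical, and the error body only loosely imitates the shape of the FastAPI response shown above; it is not how pydantic is implemented.

```python
from typing import Any, Dict, Tuple


def parse_int_param(name: str, raw: str) -> Tuple[int, Dict[str, Any]]:
    """Hypothetical helper: validate one query parameter by hand.

    Returns (status_code, body). A bad value is reported as a client
    error (422) instead of surfacing later as a server error (500).
    """
    try:
        value = int(raw)
    except ValueError:
        detail = {"loc": ["query", name], "msg": "value is not a valid integer"}
        return 422, {"detail": [detail]}
    return 200, {"value": value}


def types_endpoint(n: str) -> Tuple[int, Dict[str, Any]]:
    """Hand-validated version of the /types route: doubles n."""
    status, body = parse_int_param("n", n)
    if status != 200:
        return status, body  # stop before the business logic ever runs
    return 200, {"result": body["value"] * 2}


ok_status, ok_body = types_endpoint("21")
bad_status, bad_body = types_endpoint("foobar")

print(ok_status, ok_body)
print(bad_status, bad_body)
```

Every parameter on every route needs a check like this if the framework does not enforce types, which is the maintenance cost being described here.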

Clients and Testing

Another issue with Plumber is that it doesn’t integrate nicely with any testing framework, at least that I’m aware of. In FastAPI, and every other web framework that I’m familiar with, there’s a built-in notion of a test client, which lets you “call” your endpoints as if you were an external client. In Plumber, we’ve needed to hack similar behavior together using testthat by spinning up the API in a background process, and then running a test suite against the local instance of the API we spun up, and then spinning down. This has worked fine, but it’s clunky and much harder to maintain than a genuine, out-of-the-box way to do testing that really should ship with the web framework. I’ve heard of callthat, but I’ve never actually tried it for solving this problem.
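The spin-up, test, spin-down workaround has roughly the following shape. This sketch uses only Python's standard library in place of Plumber and testthat, and a background thread in place of a background process, so it shows the pattern and not our actual test suite. The `TypesHandler` class and the `/types` route are hypothetical stand-ins for the service under test.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


class TypesHandler(BaseHTTPRequestHandler):
    """Hypothetical service under test: GET /types?n=<int> returns n * 2."""

    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        body = json.dumps(int(query["n"][0]) * 2).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the test output quiet


# 1. Spin up the API in the background; port 0 asks the OS for a free port.
server = HTTPServer(("127.0.0.1", 0), TypesHandler)
port = server.server_address[1]
thread = threading.Thread(target=server.serve_forever, daemon=True)
thread.start()

try:
    # 2. Run the test against the local instance, as an external client would.
    connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    connection.request("GET", "/types?n=21")
    response = connection.getresponse()
    status = response.status
    payload = json.loads(response.read())
    connection.close()
finally:
    # 3. Spin the API back down, even if the test failed.
    server.shutdown()
    server.server_close()
    thread.join()

print(status, payload)
```

All of the setup and teardown around step 2 is the part a built-in test client takes off your hands.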

Performance

When I’ve defended R in the past, I’ve also heard a common complaint about its speed. There are very often arguments that R is slow, full-stop. And that’s not true, or at least mostly not true. Especially relative to Python, you can write basically equally performant code in R as you can in numpy or similar. But some things in R are slow. For instance, let’s serialize some JSON:

    library(jsonlite)

    iris <- read.csv("fastapi-example/iris.csv")

    result <- microbenchmark::microbenchmark(
      tojson = {toJSON(iris)},
      unit = "ms",
      times = 1000
    )

    paste("Mean runtime:", round(summary(result)$mean, 4), "milliseconds")
    #> [1] "Mean runtime: 0.7482 milliseconds"

Now, let’s try the same in Python:

    from timeit import timeit
    import pandas as pd

    iris = pd.read_csv("fastapi-example/iris.csv")
    N = 1000

    print(
        "Mean runtime:",
        round(1000 * timeit('iris.to_json(orient = "records")', globals = locals(), number = N) / N, 4),
        "milliseconds"
    )
    # Mean runtime: 0.1166 milliseconds

In this particular case, Python’s JSON serialization runs 6-7x faster than R’s. And if you’re thinking “that’s less than a millisecond, though!” you’d be right. But the general principle is important even if the magnitude of the issue in this particular case is not.

JSON serialization is the kind of thing that you’re going to need to do if you’re building an API, and you generally want it to be as fast as possible to limit overhead. It also takes longer and longer as the JSON itself is more complicated. So while in this particular case we’re talking about microseconds of difference, the underlying issue is clear: Plumber uses jsonlite to serialize JSON under the hood, and jsonlite is nowhere near as fast as roughly equivalent Python JSON serialization for identical data and the same result.
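If you want to see how serialization cost grows with payload size on your own machine, a small timing harness is enough. The sketch below is an assumption-laden stand-in for the benchmark above: it uses the standard library's `json.dumps` instead of pandas' `to_json`, and synthetic iris-shaped rows instead of the real CSV, so its absolute numbers are not comparable to the ones reported here. `make_records` and `mean_ms` are hypothetical helper names.

```python
import json
from timeit import timeit


def make_records(n_rows: int) -> list:
    """Build synthetic, iris-shaped rows (made-up values, not the real dataset)."""
    return [
        {
            "sepal_length": 5.1 + (i % 10) / 10,
            "sepal_width": 3.5,
            "petal_length": 1.4,
            "petal_width": 0.2,
            "species": "setosa",
        }
        for i in range(n_rows)
    ]


def mean_ms(records: list, number: int = 20) -> float:
    """Mean wall-clock time of one json.dumps call, in milliseconds."""
    return 1000 * timeit(lambda: json.dumps(records), number=number) / number


small_ms = mean_ms(make_records(150))     # about the size of iris
large_ms = mean_ms(make_records(15_000))  # 100x as many rows

print(f"150 rows:    {small_ms:.4f} ms per call")
print(f"15,000 rows: {large_ms:.4f} ms per call")
```

The point is not the specific timings but that per-response overhead scales with the payload, so a constant-factor gap between serializers gets more expensive as responses get larger.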
The takeaway here is that vectorized R code that creates a feature for a model, or the low-level BLAS or LAPACK code that R calls to perform matrix multiplication, should be about as performant as the equivalent Python. But R can sometimes have overhead, like in JSON serialization, that becomes apparent as the size and complexity of both the request body and the response body scale up. There are certainly other examples of the same overhead. When we moved a Plumber service to FastAPI with no changes to the logic itself, we got about a 5x speedup in how long it took to process requests. And just to reiterate: That 5x speedup had nothing to do with changes to logic, models, or anything tangible about the code. All of that was exactly the same.