Exporting CSV files in Rails

This article is an extension of the Railscast video on exporting data to a csv file, you can watch the video here

Needing to export data from a database to a readable format is a common situation. Luckily, Ruby version 1.9 now includes the CSV module in the standard library. This allows you to easily export data to a cross platform spreadsheet.

This post will run through four ways of using that module to export data in Rails.

First off we'll export all the data from a collection of rows from a table. Next we'll include a 1:1 relation in the exported spreadsheet. Then we'll move onto filtering the attributes included in the exported data. Because not every table needs to include created_at and updated_at.
This post should allow you to write your own exporting method that will work in whatever situation you need.

Setup

First off we're going to need to include the CSV module, in Rails you'll want to include this in config/application.rb, like so:

config/application.rb

require File.expand_path('../boot', __FILE__)

require 'csv'

require 'rails/all'

Next we'll need to create a route that points to our controller's export action.

get 'export', to: 'foo#export', as: :foo_export

Then we will add our export action to the foo_controller. We'll want to only respond to csv format requests by setting up the export action as so:

controllers/foo_controller.rb

def export
  @data = Foo.order(:created_at)
  respond_to do |format|
    format.html { redirect_to root_url }
    format.csv { send_data @data.to_csv }
  end
end

Now we can create links in our views to export the Foo table like this.

<%= link_to "Export Data", foo_export_path(format: "csv") %>

Clicking the 'Export Data' link will start a download of the csv file created by Foo#to_csv. The next section will go over creating that method.

Creating the CSV file for Foo

This is where the magic happens. This method will create a CSV file containing all of the rows in the Foo collection that #to_csv was called on. This allows you to order or filter the rows included in any way you would order the collection.

The #to_csv file will be created in models/foo.rb

def self.to_csv(options = {})
  CSV.generate(options) do |csv|
    csv.add_row column_names
    all.each do |foo|
      values = foo.attributes.values
      csv.add_row values
    end
  end
end

On the first line of the method we called CSV.generate, this will create the csv file that we are exporting. By manipulating the csv varible created in the block passed to it we will manipulate how the csv file is created.

First off we add in the title row to the csv file.

csv.add_row column_names

This uses column_names to pass in the names of the attributes that Foo has. We add this using .add_row to our csv file.
.add_row will split arrays up into columns.

Next up we iterate through each foo in the collection #to_csv was called on. We will then extract the values from the foo and add the array of values to csv in the exact same way we added column_names. A missing attribute will leave a blank column for that row in the spreadsheet, other attributes after it will appear in their correct column.

all.each do |foo|
  values = foo.attributes.values
  csv.add_row values
end

And that's all there is to create a basic #to_csv method that can be called on an activerecord collection of Foo objects. By adjusting how @data is created in export_controller#export you can filter which rows are included in the exported data.

The above function will export all data from a collection of rows. When exported our csv file will contain this:

id,name,email,age,created_at,updated_at
1,Jane,jane@example.com,31,2012-05-13 16:32:38 UTC,2012-05-13 16:32:38 UTC
2,Thomas,tom@example.com,36,2014-03-05 11:06:22 UTC,2014-03-05 11:06:22 UTC
3,Matthew,matt@example.com,27,2014-06-24 23:55:22 UTC,2014-06-24 23:55:22 UTC
4,Al,al@example.com,52,2014-10-18 09:19:29 UTC,2014-10-18 09:19:29 UTC

Which when opened as a spreadsheet will display like this:

One Table All Columns

Including the data from other tables

The above is great if you just want to export one table, but what if you want to include a 1:1 relation in the csv file?

In the example below we assume that Foo has a has_one relation with Bar. To include our foo's bar we'll create a new method called to_csv_with_bar like so:

def self.to_csv_with_bar(options = {})
  CSV.generate(options) do |csv|
    csv.add_row column_names + self.bar.column_names

    all.each do |foo|

      values = foo.attributes.values

      if foo.bar
        values += foo.bar.attributes.values
      end

      csv.add_row values
    end
  end
end

This is just a minor modification to our original #to_csv method.

First we're going to need to include the column names for both tables in our first row. Just a simple concatenation of #column names from our two tables.

csv.add_row column_names + self.bar.column_names

Further down we're going to need to add the attributes from foo's bar to our array of columns, values, that is going to be added to csv. We check for bar's existence to avoid 'method not found for class nil' errors.
We won't run into any problems if a bar in our database does not exist. The csv file will just omit those attributes from the row.

if foo.bar
  values += foo.bar.attributes.values
end

And that's all the changes we need to include bar in foo's export function. Our exported spreadsheet will contain this:

id,name,email,age,created_at,updated_at,id,name,created_at,updated_at
1,Jane,jane@example.com,31,2012-05-13 16:32:38 UTC,2012-05-13 16:32:38 UTC,1,Armer Holdings Inc.,2012-05-13 16:32:38 UTC,2012-05-13 16:32:38 UTC
2,Thomas,tom@example.com,36,2014-03-05 11:06:22 UTC,2014-03-05 11:06:22 UTC,2,Armer Holdings Inc.,2012-05-13 16:32:38 UTC,2012-05-13 16:32:38 UTC
3,Matthew,matt@example.com,27,2014-06-24 23:55:22 UTC,2014-06-24 23:55:22 UTC,3,Armer Holdings Inc.,2012-05-13 16:32:38 UTC,2012-05-13 16:32:38 UTC
4,Al,al@example.com,52,2014-10-18 09:19:29 UTC,2014-10-18 09:19:29 UTC,4,Joranium Industries,2014-10-18 09:19:29 UTC,2014-10-18 09:19:29 UTC

Which when opened as a spreadsheet will look like this:

Two Tables All Columns

Filtering a table to include only specific columns

The above solutions have one problem, we're including all the columns for the tables being exported. This could produce some really unneeded information, for example we may not want to have updated_at or created_at. So let's modify #to_csv to allow us to include only the columns we want.

def self.to_csv(attributes = column_names, options = {})

  CSV.generate(options) do |csv|
    csv.add_row attributes

    # Iterate through all the rows.
    all.each do |foo|

      values = foo.attributes.slice(*attributes).values

      csv.add_row values
    end
  end
end

We've added in an optional parameter called attributes, this is an array that includes the names of all the columns we want to include in our exported csv file. We will provide attributes with a default value of column_names so that without it being sent in all columns will be included.

def self.to_csv(attributes = column_names, options = {})

Now instead of passing column_names into our first add_row we will pass in attributes, this will include only the columns we desire in the csv file's first row.

csv.add_row attributes

Next we use slice on foo.attributes to get only the columns we want from the hash returned.

values = foo.attributes.slice(*attributes).values

Then we just add values to csv using add_row and we've got our finished csv file with only the columns we want.

To use our new filtering system we call to_csv like so:

attributes_to_include = %w(name email age)

Foo.all.to_csv(attributes_to_include)

That will produce the following csv file:

name,email,age
Jane,jane@example.com,31
Thomas,tom@example.com,36
Matthew,matt@example.com,27
Al,al@example.com,52

which when loaded looks like this:

Single Table, Filtered Columns

The great thing about this approach is that the filtering is optional and can be adjusted to include exactly the columns you want from the table being exported.

Combining both filtered columns and multiple tables.

We can combine both filtered columns with multiple tables to combine only the wanted columns from both foo and bar. An example of how that could be done is below.

def self.to_csv(foo_attributes = column_names, bar_attributes = bar.column_names, options = {})

  CSV.generate(options) do |csv|
    csv.add_row foo_attributes + bar_attributes

    all.each do |foo|

      values = foo.attributes.slice(*foo_attributes).values

      if foo.contact_details
        values += foo.contact_details.attributes.slice(*bar_attributes).values
      end

      csv.add_row values
    end
  end
end

Thanks for reading, if you have any questions you can shoot me an email.

Thanks for subscribing!
Like what I'm doing? Subscribe and I'll let you know when I write new stuff.
Subscribe