How to get custom submission and author export

Last updated 10 days ago

While exports are available for submissions in the UI, if you want to create an export with information not available from the default export, you can use the Python client to do so.

Get Submissions

  1. If you have not done so, you will need to install and instantiate the openreview-py client.

  2. Get the submissions that you want to export.

Extract data

There are a couple of methods you can use to extract information from the submissions. Which one you choose depends on how many papers you have, how many fields you want to extract, and personal preference.

Whichever method you use, it is important to understand the structure of the data: both Notes (Submissions) and Profiles store nested dictionaries within the content property. In practice, this means:

  1. If you index a field that doesn't exist, the code will stop with a KeyError. Rather than getting the value directly, I recommend using the .get(<fieldname>, <null_value>) dictionary method. This returns the value of the field if it exists and the null value otherwise, instead of raising an error. Typically, the null value should match the type of the expected output (for example, '' for a string field or [] for a list field).

  2. For most fields in a submission or profile, you will need to look within the content property (see the example in item 3).

  3. For submissions, each field in content is itself a dictionary that holds the data under the 'value' key, so use the format submission.content[<fieldname>].get('value', <null_value>). If the field itself may be absent, use submission.content.get(<fieldname>, {}).get('value', <null_value>).

  4. For profiles, there may be multiple nested items in the content. For example, a user with multiple past affiliations will have a list of dictionaries for their history, with the current affiliation listed first. Profiles may have different lengths for these values, which is important to keep in mind when extracting data.
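The points above can be sketched with plain dictionaries. This is a minimal illustration, not part of the openreview-py client: the helper names `note_value` and `first_history_entry` are hypothetical, and the sample data is made up to mirror the shape of note and profile content described here.

```python
def note_value(content, field, default=None):
    """Return content[field]['value'], or default if either level is missing."""
    entry = content.get(field)
    if not isinstance(entry, dict):
        return default
    return entry.get("value", default)


def first_history_entry(profile_content):
    """Return the first history entry, or {} if history is missing or empty."""
    history = profile_content.get("history") or [{}]
    return history[0]


# Made-up stand-ins shaped like note content and profile content
note_content = {
    "title": {"value": "A Sample Paper"},
    "authorids": {"value": ["~First_Last2"]},
}
profile_content = {
    "history": [{"position": "PhD Student",
                 "institution": {"domain": "university.edu"}}],
}

title = note_value(note_content, "title", "")
keywords = note_value(note_content, "keywords", [])  # field absent, so default is used
domain = first_history_entry(profile_content).get("institution", {}).get("domain", "")
no_history_domain = first_history_entry({}).get("institution", {}).get("domain", "")
```

Chaining `.get` with a default of the expected type keeps every lookup safe, including for profiles whose history is an empty list.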

Method 1: Loop

The simplest method is to loop through all submissions and extract the relevant information. Here we simply print the data, but you could also write it to a csv. Below is an example where the submission number, title, author names, and author current affiliation are extracted from the data.

import openreview

for submission in submissions:
    # Each content field is a dictionary with the data stored under 'value'
    print(submission.number, submission.content['title'].get('value', ''))
    author_ids = submission.content['authorids'].get('value', [])
    author_profiles = openreview.tools.get_profiles(client, author_ids)
    for author_profile in author_profiles:
        history = author_profile.content.get('history') or [{}]  # current affiliation is first
        print(author_profile.get_preferred_name(pretty=True), history[0])
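The loop above prints each row. If you would rather write a CSV, one possible sketch uses the standard `csv` module. The `submission_rows` helper and the `SimpleNamespace` objects are hypothetical stand-ins for real Note objects, used only so the example runs without a server connection.

```python
import csv
import io
from types import SimpleNamespace


def submission_rows(submissions):
    """Yield one [number, title, author ids joined by ';'] row per submission."""
    for submission in submissions:
        title = submission.content.get("title", {}).get("value", "")
        author_ids = submission.content.get("authorids", {}).get("value", [])
        yield [submission.number, title, ";".join(author_ids)]


# Made-up stand-ins for Note objects (only .number and .content are used)
submissions = [
    SimpleNamespace(number=1, content={
        "title": {"value": "First Paper"},
        "authorids": {"value": ["~Ada_One1", "~Bo_Two1"]},
    }),
    SimpleNamespace(number=2, content={"title": {"value": "Second Paper"}}),
]

# An in-memory buffer keeps the sketch self-contained; to write a file,
# use open("submissions.csv", "w", newline="") instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["number", "title", "authorids"])
writer.writerows(submission_rows(submissions))
csv_text = buffer.getvalue()
```

Joining the author IDs with a semicolon keeps one row per submission. Writing one row per author is an equally reasonable layout if you plan to merge with profile data later.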

Method 2: Table

The second method is to generate a DataFrame with all of the data in the submissions, then select the relevant fields from the table for extraction. This is helpful if you want many fields from the submission, or if you have many submissions. Below is an example where the submission number, title, author names, and author current affiliation are extracted.

import pandas as pd

def extract_content_values(note):
    # Pull the 'value' out of each content field and add the submission number
    content = {k: v.get('value', None) for k, v in note.content.items()}
    content['number'] = note.number
    return content

df = pd.DataFrame([extract_content_values(note) for note in submissions])

#Subset for the relevant fields
subset_df = df[['number','title','authors','authorids']]

#export data
subset_df.to_csv('submission_information.csv', index=False)

If all you need is submission information, you can export the data at this point. If you want more information about the authors, follow the directions below.

Getting Profile Information

Once the DataFrame is created, you will need to get the profiles for authors to get author information.

#get a list of all unique author IDs
all_author_ids = list({author_id for ids in subset_df['authorids'] for author_id in ids})

profiles = openreview.tools.get_profiles(client, all_author_ids)

Because profiles are nested dictionaries, you need to flatten the dictionary to create the fields, then extract the content, similarly to how the submission content was extracted. The original profile information looks something like this:

{'active': True,
 'content': {'emails': ['name@university.edu'],
             'emailsConfirmed': ['name@university.edu'],
             'history': [{'end': None,
                          'institution': {'country': 'US',
                                          'domain': 'university.edu'},
                          'position': 'PhD Student',
                          'start': 2017}],
             'homepage': 'https://test.com',
             'names': [{'fullname': 'First Last',
                        'preferred': True,
                        'username': '~First_Last2'}],
             'preferredEmail': 'name@university.edu',
             'relations': []},
 'id': '~First_Last2',
 ...<other metacontent>...
 
 }

After flattening, it would look like this:

(Shown transposed for readability: each line is one column of the flattened DataFrame, with its value for row 0.)

column                          value (row 0)
preferredEmail                  name@university.edu
homepage                        https://test.com
emails_0                        name@university.edu
names_0_preferred               True
names_0_fullname                First Last
names_0_username                ~First_Last2
history_0_position              PhD Student
history_0_start                 2017
history_0_end                   None
history_0_institution_country   US
history_0_institution_domain    university.edu
emailsConfirmed_0               name@university.edu
profile_id                      ~First_Last2

List-valued profile fields produce one set of columns per entry, numbered by position, for example names_0_preferred and names_0_fullname for the first name entry. Because profiles contain different numbers of entries (for example, affiliations in their history), some of these columns will be null for some profiles.

from collections.abc import MutableMapping

def flatten_dict(d, parent_key='', sep='_'):
    """
    Recursively flattens a dictionary, concatenating nested keys.
    """
    items = []
    for k, v in d.items():
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, MutableMapping):
            items.extend(flatten_dict(v, new_key, sep=sep).items())
        elif isinstance(v, list):
            for i, elem in enumerate(v):
                # Handle lists of dictionaries by adding an index
                if isinstance(elem, MutableMapping):
                    items.extend(flatten_dict(elem, f"{new_key}{sep}{i}", sep=sep).items())
                else:
                    # Just add the element if it's not a dictionary
                    items.append((f"{new_key}{sep}{i}", elem))
        else:
            items.append((new_key, v))
    return dict(items)

def extract_content(profile):
    # Flatten the nested profile content and keep the profile ID alongside it
    content = flatten_dict(profile.content)
    content['profile_id'] = profile.id
    return content


#Create a DataFrame with the flattened profile content + profile ID
profile_df = pd.DataFrame([extract_content(profile) for profile in profiles])

#extract the columns you want included in the data
relevant_columns = ['profile_id'] + [c for c in profile_df.columns if 'history_0' in c] 
profile_df_subset = profile_df[relevant_columns]
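To illustrate why some flattened columns end up null, here is a small sketch with two made-up, already-flattened profiles. The IDs and positions are invented, and only the pandas behavior of filling absent keys with NaN is being shown.

```python
import pandas as pd

# Made-up flattened profiles: the first has two history entries, the second one
flattened_profiles = [
    {"profile_id": "~Ada_One1",
     "history_0_position": "Professor",
     "history_1_position": "Postdoc"},
    {"profile_id": "~Bo_Two1",
     "history_0_position": "PhD Student"},
]

toy_profile_df = pd.DataFrame(flattened_profiles)

# Keys absent from a profile become NaN in that profile's row
missing_second_entry = toy_profile_df["history_1_position"].isna().tolist()
```

When you select columns such as those containing 'history_0', it can help to check `isna()` first so that empty entries are not mistaken for real values.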



The output will have the structure shown in the flattened example earlier on this page.

Now that you have the two DataFrames, you can combine them in a variety of ways. Below are a few specific examples:

Example: Check if any authors have a particular trait

In this case, we are checking if any authors have the 'example.com' domain.

domain_to_check = 'example.com'

#create a column that checks for that domain across the selected profile columns
profile_df['contains_domain'] = profile_df_subset.apply(lambda row: any(
    row.astype(str).str.contains(domain_to_check, case=False, na=False, regex=False)
), axis=1)

#map to a column ['any_has_domain'] that is True if any author has that domain in their history
def map_has_domain(df_notes, df_profiles, id_col='authorids', profile_id_col='profile_id', domain_col='contains_domain'):
    # Build lookup: profile_id → has_domain (from df_profiles)
    has_domain_lookup = df_profiles.set_index(profile_id_col)[domain_col].to_dict()

    # For each list of author IDs, return True if any have has_domain = True
    def any_has_domain(id_list):
        return any(has_domain_lookup.get(pid, False) for pid in id_list)

    df_notes['any_has_domain'] = df_notes[id_col].apply(any_has_domain)
    return df_notes

mapped_df = map_has_domain(df, profile_df)

#subset for only submissions with that domain
affiliated_df = mapped_df.loc[mapped_df['any_has_domain']]
affiliated_df.shape
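Because the check above is a substring match across several columns, a domain such as 'notexample.com' would also count as a hit for 'example.com'. If that matters for your venue, one alternative is to compare institution domains directly. The sketch below works on plain dictionaries. The `has_domain` helper and the sample data are hypothetical, and treating subdomains as matches is an assumption you may want to change.

```python
def has_domain(profile_content, domain):
    """True if any history entry's institution domain equals `domain`
    or is a subdomain of it (for example, cs.example.com)."""
    domain = domain.lower()
    for entry in profile_content.get("history") or []:
        entry_domain = ((entry.get("institution") or {}).get("domain") or "").lower()
        if entry_domain == domain or entry_domain.endswith("." + domain):
            return True
    return False


# Made-up profile content keyed by profile ID
profiles_by_id = {
    "~Ada_One1": {"history": [{"institution": {"domain": "cs.example.com"}}]},
    "~Bo_Two1": {"history": [{"institution": {"domain": "notexample.com"}}]},
    "~Cy_Three1": {},  # no history at all
}
# Made-up mapping of submission number to author IDs
submission_authors = {1: ["~Ada_One1", "~Bo_Two1"], 2: ["~Bo_Two1", "~Cy_Three1"]}

flagged = {
    number: any(has_domain(profiles_by_id.get(pid, {}), "example.com") for pid in ids)
    for number, ids in submission_authors.items()
}
```

With real data, `profiles_by_id` could be built from the profiles returned earlier, keyed by each profile's id.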

Example: Label profiles with submission number

This example will add the submission number to each profile in the profile DataFrame:

# Step 1: Explode the 'authorids' list into separate rows
df_exploded = subset_df.explode('authorids')

# Step 2: Merge the profile DataFrame with the exploded submission DataFrame
df_merged = pd.merge(profile_df_subset, df_exploded[['authorids','number']], left_on='profile_id', right_on='authorids', how='left')

# Optional: Drop the 'authorids' column if it's no longer needed
df_merged = df_merged.drop(columns='authorids')
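One thing to keep in mind with this merge is that an author on several submissions appears once per submission in the result. The toy example below, with made-up IDs and values standing in for the real DataFrames, shows that behavior and one possible way to collapse back to a single entry per profile.

```python
import pandas as pd

# Made-up stand-ins for subset_df and profile_df_subset
toy_submissions = pd.DataFrame({
    "number": [1, 2],
    "authorids": [["~Ada_One1", "~Bo_Two1"], ["~Ada_One1"]],
})
toy_profiles = pd.DataFrame({
    "profile_id": ["~Ada_One1", "~Bo_Two1"],
    "history_0_position": ["Professor", "PhD Student"],
})

# Same explode + merge pattern as above
toy_exploded = toy_submissions.explode("authorids")
toy_merged = pd.merge(
    toy_profiles,
    toy_exploded[["authorids", "number"]],
    left_on="profile_id",
    right_on="authorids",
    how="left",
).drop(columns="authorids")

# ~Ada_One1 is on two submissions, so the merge yields one row per pair.
# Collapse to one list of submission numbers per profile if preferred.
numbers_per_profile = toy_merged.groupby("profile_id")["number"].apply(list).to_dict()
```

Whether you keep one row per (profile, submission) pair or one row per profile depends on what the export is for. The pair layout is easier to filter, while the collapsed layout is easier to read.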

Example: Add preferredEmails to submission DataFrame

This example will get a list of preferred emails for all authors listed for a submission:

# Step 1: Explode the 'authorids' list into separate rows
df_exploded = df.explode('authorids')

# Step 2: Merge with the profile DataFrame to get preferredEmail
df_merged = pd.merge(df_exploded, profile_df[['profile_id','preferredEmail']], left_on='authorids', right_on='profile_id', how='left')

# Step 3: Group by 'number' and aggregate 'preferredEmail' into a list
df_result = df_merged.groupby('number')['preferredEmail'].apply(list).reset_index()

# Step 4: Merge the email lists back into the submission DataFrame
df_submission_with_emails = pd.merge(subset_df, df_result, on='number', how='left')

print(df_submission_with_emails)
