How to Upload Edges in Bulk
Prerequisites and Setup
Follow these instructions if you already have Edges ready to upload. For example, if you followed:
If you are doing a full score re-computation, you should instead upload scores as a CSV file using setup_committee_matching(). It also handles full conflict re-computation.
Have a reliable internet connection. Posting a large number of edges takes time, so you need to stay connected for the whole upload.
Different methods for posting edges in bulk
1. Calling post_bulk_edges
The simplest way to upload edges in bulk is to call post_bulk_edges and pass all your edges:
openreview.tools.post_bulk_edges(client=client_v2, edges=edges)
This will post edges in batches of 50,000 edges. You can modify the batch size by passing the batch_size argument, e.g. batch_size=10000.
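To make the effect of batch_size concrete, here is a minimal sketch of how a list is split into fixed-size batches. It illustrates the idea only and is not the library's implementation; iter_batches and fake_edges are hypothetical names, and plain integers stand in for Edge objects.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int = 50000) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each holding at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Stand-in data: 120,000 placeholder integers instead of real Edge objects.
fake_edges = list(range(120000))

# With the default size, the last batch holds whatever is left over.
batch_sizes = [len(batch) for batch in iter_batches(fake_edges)]
print(batch_sizes)

# A smaller batch size produces more, shorter batches.
small_batch_count = sum(1 for _ in iter_batches(fake_edges, batch_size=10000))
print(small_batch_count)
```

A smaller batch size means more requests, but each request carries less data, which may help on a slow or unstable connection.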
However, a large number of edges can be easier to manage if you post them in bulk for each paper or for each user. This can also help you avoid posting duplicate edges for the same head/tail pair, e.g. for scores or conflicts. See the methods below.
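Before uploading, it may be worth checking your edge list for repeated head/tail pairs. The sketch below is one way to do that locally, assuming each edge exposes head and tail attributes. FakeEdge and find_duplicate_pairs are hypothetical names used for illustration and are not part of the OpenReview client.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in that carries only the fields this check needs.
FakeEdge = namedtuple("FakeEdge", ["head", "tail", "weight"])


def find_duplicate_pairs(edges):
    """Return the sorted (head, tail) pairs that appear more than once in `edges`."""
    counts = Counter((edge.head, edge.tail) for edge in edges)
    return sorted(pair for pair, count in counts.items() if count > 1)


sample_edges = [
    FakeEdge("paper1", "~User1", 0.9),
    FakeEdge("paper1", "~User2", 0.4),
    FakeEdge("paper1", "~User1", 0.7),  # same head/tail pair as the first edge
]

duplicates = find_duplicate_pairs(sample_edges)
print(duplicates)
```

An empty result means every head/tail pair in the list is unique. It says nothing about edges that already exist on the server.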
2. Posting edges in bulk for each paper
Use this if you computed scores or conflicts for a subset of papers against all members of a group. This assumes you have a list of edges called score_edges ready to upload and a list of the corresponding submissions called submissions.
Map paper IDs to the list of edges for that paper.
For affinity scores and conflicts, the head of the edge is usually the paper ID. Refer to the invitation for the correct edge configuration, e.g.: https://openreview.net/invitation/edit?id=venue_id/role_name/-/Affinity_Score
Loop through the submissions and post the edges for each paper:
from collections import defaultdict

import openreview

# Group the edges by paper: for scores and conflicts the head is usually the paper ID.
paper_id_to_edges = defaultdict(list)
for edge in score_edges:
    paper_id_to_edges[edge.head].append(edge)

# Post each paper's edges as a separate bulk upload.
for sub in submissions:
    print("Processing edges for paper number:", sub.number, "ID:", sub.id)
    openreview.tools.post_bulk_edges(client=client_v2, edges=paper_id_to_edges[sub.id])
The difference from method 1 is that the edges are posted in groups. If an upload fails and you need to re-upload, you can delete the edges for just that paper and then post them again:
client_v2.delete_edges(invitation=invitation_id, head=sub.id, soft_delete=True)
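To know which papers need the delete-and-re-upload step, it can help to record which groups failed during the loop. The following is a sketch of that bookkeeping, not part of the OpenReview client. upload_by_group, flaky_post, and demo_edges are hypothetical names, the posting function is passed in so the example runs without a server, and catching every Exception is an assumption, since the errors a real upload can raise are not listed here.

```python
def upload_by_group(group_ids, edges_by_group, post_fn):
    """Post each group's edges separately and return (succeeded_ids, failed_ids)."""
    succeeded, failed = [], []
    for group_id in group_ids:
        group_edges = edges_by_group.get(group_id, [])
        if not group_edges:
            continue  # nothing to upload for this group
        try:
            post_fn(group_edges)
            succeeded.append(group_id)
        except Exception as error:  # assumption: any error means the group must be redone
            print("Upload failed for", group_id, ":", error)
            failed.append(group_id)
    return succeeded, failed


def flaky_post(edges):
    """Stand-in for a real upload call; simulates a dropped connection for paper2."""
    if edges[0].startswith("paper2"):
        raise ConnectionError("simulated dropped connection")


# Strings stand in for Edge objects; paper3 has no edges and is skipped.
demo_edges = {
    "paper1": ["paper1->~User1"],
    "paper2": ["paper2->~User1"],
    "paper3": [],
}

succeeded, failed = upload_by_group(["paper1", "paper2", "paper3"], demo_edges, flaky_post)
print("succeeded:", succeeded)
print("failed:", failed)
```

In a real run, post_fn could wrap the post_bulk_edges call shown above, and the IDs in failed are the ones to clean up with delete_edges before posting again.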
3. Posting edges in bulk for each user
Use this if you computed scores or conflicts for a subset of users against all papers. This assumes you have a list of edges called score_edges ready to upload and a list of the subset of users called user_list.
Map user IDs to the list of edges for that user.
For affinity scores and conflicts, the tail of the edge is usually the profile ID of the user. Refer to the invitation for the correct edge configuration, e.g.: https://openreview.net/invitation/edit?id=venue_id/role_name/-/Affinity_Score
Loop through the users and post the edges for each user:
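from collections import defaultdict

import openreview

# Group the edges by user: for scores and conflicts the tail is usually the profile ID.
profile_id_to_edges = defaultdict(list)
for edge in score_edges:
    profile_id_to_edges[edge.tail].append(edge)

user_list = ["~User1", "~User2"]  # ... plus the rest of your subset of profile IDs

# Post each user's edges as a separate bulk upload.
for user in user_list:
    print("Processing edges for user:", user)
    openreview.tools.post_bulk_edges(client=client_v2, edges=profile_id_to_edges[user])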
As with method 2, the edges are posted in groups. If an upload fails and you need to re-upload, you can delete the edges for just that user and then post them again:
client_v2.delete_edges(invitation=invitation_id, tail="~User1", soft_delete=True)
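One thing to watch with the defaultdict(list) grouping: looking up a profile ID that has no edges returns an empty list instead of raising an error, so a typo in user_list would go unnoticed. A small local check like the sketch below can catch that before uploading. FakeEdge and users_without_edges are hypothetical names used for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in that carries only the fields this check needs.
FakeEdge = namedtuple("FakeEdge", ["head", "tail"])


def users_without_edges(user_list, edges):
    """Return the users from `user_list` that never appear as the tail of an edge."""
    tails = {edge.tail for edge in edges}
    return [user for user in user_list if user not in tails]


sample_edges = [
    FakeEdge("paper1", "~User1"),
    FakeEdge("paper2", "~User1"),
]

# "~User2" has no edges, e.g. because of a typo or a missing score computation.
missing = users_without_edges(["~User1", "~User2"], sample_edges)
print(missing)
```

The same check applies to method 2 if you compare submission IDs against edge heads instead.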