1. See the official Django documentation on foreign keys: https://docs.djangoproject.com/zh-hans/3.2/ref/models/fields/#django.db.models.ForeignKey
2. Define the models
# models.py
from django.db import models

# Create your models here.
class Artist(models.Model):
    name = models.CharField(max_length=10)
    ISSN = models.CharField(max_length=20)
    ImpactFactor = models.CharField(max_length=10)


class Album(models.Model):
    # Many albums can point to one artist; deleting the artist deletes its albums
    artist = models.ForeignKey(Artist, on_delete=models.CASCADE)
    JournalName = models.CharField(max_length=20)
3. Bulk-load the table that has no foreign key (only the key code is shown)
from polls.models import Artist, Album

# Clear existing rows before reloading
Artist.objects.all().delete()

artist_batch_data = []
issn_to_artist_id = {}
journal_set = set()
issn_set = set()

# Journal_file: path to the tab-separated journal file (defined elsewhere)
with open(Journal_file) as f:
    f.readline()  # skip the header row
    for line in f:
        issn, journal_name, journal_abbr, impact_factor, h_index, altmetric = line.strip().split('\t')
        # Test key membership; .get() would treat an id of 0 as missing
        if issn not in issn_to_artist_id:
            # Ids start at 1 because some backends (e.g. MySQL) reject 0 for an AutoField
            issn_to_artist_id[issn] = len(issn_to_artist_id) + 1
            artist_batch_data.append(Artist(name=journal_name,
                                            ISSN=issn,
                                            ImpactFactor=impact_factor,
                                            id=issn_to_artist_id[issn]))
        journal_set.add(journal_name)
        issn_set.add(issn)

Artist.objects.bulk_create(artist_batch_data)
print(f'JournalInfo data size: {len(artist_batch_data):,}')
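The ISSN-to-id bookkeeping in the loop above can be tried without a database. The following is a minimal sketch, not the code used in this post: the sample rows and the helper name `assign_ids` are made up for illustration, and ids are assumed to start at 1.

```python
import csv
import io

# Hypothetical sample standing in for the journal file (tab-separated, one header row)
SAMPLE = (
    "issn\tname\tabbr\tif\th\talt\n"
    "1111-1111\tJournal A\tJA\t3.2\t10\t5\n"
    "2222-2222\tJournal B\tJB\t1.1\t4\t2\n"
    "1111-1111\tJournal A2\tJA2\t3.2\t10\t5\n"
)


def assign_ids(handle):
    """Map each distinct ISSN to a 1-based id, in order of first appearance."""
    issn_to_id = {}
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    for row in reader:
        issn = row[0]
        # `in` tests for the key itself; `.get(issn)` would treat an id of 0 as missing
        if issn not in issn_to_id:
            issn_to_id[issn] = len(issn_to_id) + 1
    return issn_to_id


ids = assign_ids(io.StringIO(SAMPLE))
print(ids)
```

The repeated ISSN keeps the id it received the first time, which is what lets several rows refer to the same artist later on.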
- Populate the table that has a foreign key
# Clear existing rows before reloading
Album.objects.all().delete()

journal_list = list(journal_set)[:10]
issn_list = list(issn_set)[:3]
album_batch_data = []
for idx, journal_name in enumerate(journal_list):
    # Cycle through the sampled ISSNs so that several albums share one artist
    issn = issn_list[idx % len(issn_list)]
    artist_id = issn_to_artist_id[issn]
    album_batch_data.append(Album(artist_id=artist_id,
                                  JournalName=journal_name))
# Album rows must be created through the Album manager
Album.objects.bulk_create(album_batch_data)
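4. Afterword
(1) This post covers a one-to-many foreign key, i.e. the same artist_id may appear in more than one Album row
(2) The database generated automatically by the migration is as follows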
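To illustrate note (1) independently of Django, here is a minimal sketch using Python's sqlite3 module. The table and column names are assumptions modeled loosely on the models above, not the actual migration output. Django applies on_delete=models.CASCADE itself rather than through a database constraint, so the SQL `ON DELETE CASCADE` clause below only mimics the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE album ("
    " id INTEGER PRIMARY KEY,"
    " journal_name TEXT,"
    " artist_id INTEGER NOT NULL REFERENCES artist(id) ON DELETE CASCADE)"
)

conn.execute("INSERT INTO artist (id, name) VALUES (1, 'A')")
# One-to-many: two album rows carry the same artist_id
conn.executemany(
    "INSERT INTO album (journal_name, artist_id) VALUES (?, ?)",
    [("x", 1), ("y", 1)],
)
shared = conn.execute(
    "SELECT COUNT(*) FROM album WHERE artist_id = 1"
).fetchone()[0]

# Deleting the parent row removes every album that referenced it
conn.execute("DELETE FROM artist WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM album").fetchone()[0]
print(shared, remaining)
```

Both album rows reference artist 1 before the delete, and none are left afterwards.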