python - Assuming a many-to-many relationship model in SQLAlchemy with MySQL, how to update data ignoring duplicates?
I am a little stuck with SQLAlchemy while trying to update data.
I have a many-to-many and a one-to-many relationship. The one-to-many relationship is between an author and the possible spellings of his name. The many-to-many relationship links authors to the literature they have written. A paper may have several authors, and vice versa.
Assume the author "Peter Shaw" has 4 papers stored and linked to him in the database. Now I want to "add" a new set of 6 papers for "Peter Shaw". Unfortunately, 4 of the 6 papers are already stored in the database. That is why session.commit() results in a duplicate error.
Is there a common way to avoid duplicate errors, and to tell SQLAlchemy to just fill in the holes instead of complaining about duplicates? Neither the SQLAlchemy docs nor Google gave me an explicit answer or approach, so any suggestions are appreciated.
These are the models I am testing with:
    from sqlalchemy import Column, Date, ForeignKey, Integer, String
    from sqlalchemy.orm import relationship

    # Base (the declarative base) and author_paper (the association table)
    # are defined elsewhere and are not shown in the question.

    class NameSpelling(Base):
        __tablename__ = 'name_spellings'
        id = Column(Integer, primary_key=True)
        name = Column(String(255), nullable=False, unique=True, index=True)
        authors_id = Column(Integer, ForeignKey('authors.id'))

        def __init__(self, name=None):
            self.name = name

        def __repr__(self):
            return "NameSpelling(%r)" % (self.name)

    class Author(Base):
        __tablename__ = 'authors'
        id = Column(Integer, primary_key=True)
        name = Column(String(255), nullable=True, unique=True, index=True)
        papers = relationship('Paper', secondary=author_paper, backref='authors')
        name_spellings = relationship(NameSpelling, order_by=NameSpelling.id,
                                      backref="author",
                                      cascade="all, delete, delete-orphan")

        def __init__(self, name=None):
            self.name = name

        def __repr__(self):
            return "Authors(%r, %r)" % (self.name_spellings, self.name)

    class Paper(Base):
        __tablename__ = 'papers'
        id = Column(Integer, primary_key=True)
        title = Column(String(1500), nullable=False, index=True)
        url = Column(String(255), nullable=False, unique=True, index=True)
        date = Column(Date(), nullable=True)

        def __init__(self, title=None, url=None, date=None):
            self.title = title
            self.url = url
            self.date = date

        def __repr__(self):
            return "Paper(%r)" % (self.title)
I have the exact same problem in my SQLAlchemy project. What I ended up doing (and it may well be a bad way of handling the issue) is to check the relationship collections before adding a new instance to the session, and to replace the related instances with the result of session.merge(), if any.
It looks like this:
    import sqlalchemy.exc

    def add_instance_to_session(instance, session):
        '''
        Add instance to session, while checking for existing child instances
        in the relationship collection instance.child_list.
        '''
        def _merge_and_replace(child):
            # Keep the merge lookup from flushing half-built objects.
            with session.no_autoflush:
                merged_child = session.merge(child)
                if merged_child is not child:
                    try:
                        session.expunge(child)
                    except sqlalchemy.exc.InvalidRequestError:
                        # child wasn't in the session to begin with
                        pass
                    return merged_child
                else:
                    return child

        # Build a real list; a lazy map object is not a valid collection.
        instance.child_list = [_merge_and_replace(c) for c in instance.child_list]
        session.add(instance)
This seems to work for me, but it comes across as pretty bad performance-wise, especially if you have many children. Maybe there is a better way that utilizes the ON DUPLICATE KEY idiom that MySQL offers, or similar constructs.
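SQLAlchemy's MySQL dialect exposes the ON DUPLICATE KEY idiom as insert(...).on_duplicate_key_update(...). The following is a minimal sketch, assuming a hypothetical author_paper association table with a composite primary key (its definition is not shown in the question). It only builds and compiles the statement, and it works on the table directly rather than through the ORM relationship, so treat it as an illustration and not as a drop-in replacement:

```python
from sqlalchemy import Column, Integer, MetaData, Table
from sqlalchemy.dialects import mysql
from sqlalchemy.dialects.mysql import insert

metadata = MetaData()

# Hypothetical association table (an assumption, not from the question).
# The composite primary key is what makes MySQL treat a repeated
# (author_id, paper_id) pair as a duplicate.
author_paper = Table(
    "author_paper",
    metadata,
    Column("author_id", Integer, primary_key=True),
    Column("paper_id", Integer, primary_key=True),
)


def link_statement(author_id, paper_id):
    """Build an INSERT that turns a duplicate link into a no-op update."""
    stmt = insert(author_paper).values(author_id=author_id, paper_id=paper_id)
    # Assigning a key column to the value being inserted leaves an
    # already existing row unchanged instead of raising an error.
    return stmt.on_duplicate_key_update(author_id=stmt.inserted.author_id)


# Compile against the MySQL dialect without connecting to a server.
sql = str(link_statement(1, 42).compile(dialect=mysql.dialect()))
print(sql)
```

With a live session the statement would be run through session.execute(link_statement(...)). A plain insert with prefix_with("IGNORE") is another option, but INSERT IGNORE also silences errors other than duplicate keys, so it is the blunter tool.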
[edit] The session.expunge() part is unnecessary if the above method is used to add instances to the session, since the children cannot be in the session at that point. At least that's how I think it is...
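The check-before-adding idea can also be sketched for the models in the question by looking papers up by their unique url in a single query instead of merging each child. This is only a sketch under assumptions: the add_papers helper and the author_paper table definition are hypothetical, the models are trimmed down, SQLAlchemy 1.4 or later is assumed, and an in-memory SQLite database stands in for MySQL:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()

# Hypothetical association table; the question does not show its definition.
author_paper = Table(
    "author_paper",
    Base.metadata,
    Column("author_id", ForeignKey("authors.id"), primary_key=True),
    Column("paper_id", ForeignKey("papers.id"), primary_key=True),
)


class Author(Base):
    __tablename__ = "authors"
    id = Column(Integer, primary_key=True)
    name = Column(String(255), unique=True)
    papers = relationship("Paper", secondary=author_paper, backref="authors")


class Paper(Base):
    __tablename__ = "papers"
    id = Column(Integer, primary_key=True)
    title = Column(String(1500), nullable=False)
    url = Column(String(255), nullable=False, unique=True)


def add_papers(session, author, papers):
    """Link papers to author, reusing rows whose url is already stored."""
    urls = [p.url for p in papers]
    # One query fetches every paper that already exists.
    existing = {p.url: p for p in session.query(Paper).filter(Paper.url.in_(urls))}
    for paper in papers:
        stored = existing.get(paper.url, paper)
        if stored not in author.papers:
            author.papers.append(stored)
        # Remember it so a url repeated within the batch is not added twice.
        existing[stored.url] = stored


def make_paper(i):
    return Paper(title="Paper %d" % i, url="http://example.org/%d" % i)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

author = Author(name="Peter Shaw")
author.papers = [make_paper(i) for i in range(4)]
session.add(author)
session.commit()

# Six papers, four of which are already stored and linked.
add_papers(session, author, [make_paper(i) for i in range(6)])
session.commit()

paper_count = session.query(Paper).count()
link_count = len(author.papers)
```

The duplicate Paper objects are never attached to the author, so they never enter the session and the commit only inserts the two missing papers and their links. This costs one SELECT per batch, but it is not safe against another process inserting the same url between the query and the commit.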