Blog tags with Flask and Flask-SQLAlchemy


Much of the structure of this blog came from the excellent blog series by Miguel Grinberg, The Flask Mega-Tutorial. However, while that series focused on creating a "microblog", a Twitter-like app that allows multiple users to post short entries, my goal was to implement more of a traditional blog. One common feature of many blogs, especially those meant to be searched for useful information, is tagging each post with relevant categories.

To implement this feature, I needed to:

  1. Add a Tag SQLAlchemy model to represent an individual tag and a tags helper table to facilitate the many-to-many relationship.
  2. List the tags associated with each post as clickable links that search for posts with that tag.
  3. Create a new view for displaying the posts associated with a tag.

Many-to-many relationships with Flask-SQLAlchemy

Luckily for me, the official Flask-SQLAlchemy documentation has an excellent example of creating a many-to-many relationship (in fact, specifically tags for blog posts!). I mostly used the example code verbatim.

tags = db.Table('tags',
    db.Column('tag_id', db.Integer, db.ForeignKey('tag.id'), primary_key=True),
    db.Column('post_id', db.Integer, db.ForeignKey('post.id'), primary_key=True)
)

class Post(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    ....
    tags = db.relationship('Tag', secondary=tags, lazy='subquery', backref=db.backref('posts', lazy=True))
    ....

class Tag(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    text = db.Column(db.String(64), index=True, unique=True, nullable=False)

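    def __eq__(self, other):
        # Compare by text so a Tag can be matched against another Tag or a str
        if isinstance(other, Tag):
            return self.text == other.text
        if isinstance(other, str):
            return self.text == other
        # Defer to the other operand rather than raising from a comparison
        return NotImplemented

    def __hash__(self):
        # Defining __eq__ removes the inherited __hash__, so restore one
        return hash(self.text)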

    def __str__(self):
        return self.text

    def __repr__(self):
        return f'<Tag {self.text}>'

I made a few additions to the Tag model. I added a text field to Tag since there seemed to be no other obvious way to store the tag's actual text. I also implemented a custom equality comparison to allow comparing Tags to strings, which will allow me to do something like this later:

>>> tags = Tag.query.all()
>>> 'flask' in tags
True
>>> 'flaskk' in tags
False
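The membership test works because `in` compares the string against each list element with `==`, and when `str.__eq__` does not recognise the other operand, Python tries the reflected `__eq__` on the tag. A minimal plain-Python sketch of that mechanism, using a hypothetical `TagLike` stand-in rather than the real model (no database involved):

```python
class TagLike:
    """Hypothetical stand-in for the Tag model; not a SQLAlchemy class."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        if isinstance(other, TagLike):
            return self.text == other.text
        if isinstance(other, str):
            return self.text == other
        # Let Python try the other operand, then fall back to identity.
        return NotImplemented

    def __hash__(self):
        # Defining __eq__ alone sets __hash__ to None, so restore it.
        return hash(self.text)


tags = [TagLike('flask'), TagLike('python')]
found = 'flask' in tags      # str.__eq__ defers, TagLike.__eq__ answers
missing = 'flaskk' in tags
```

Returning `NotImplemented` for unrelated types means a comparison such as `tag == 5` simply evaluates to `False` instead of raising.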

Very helpful. I also overrode the __str__ method for this model to more naturally get the text of the tag. Now, it is fairly simple to create a new tag and associate it with an existing post manually by using flask shell. For example, to add a new 'flask' tag to the first post:

>>> flask_tag = Tag(text='flask')
>>> p = Post.query.first()
>>> p
<Post First post>
>>> p.tags
[]
>>> p.tags.append(flask_tag)
>>> p.tags
[<Tag flask>]
>>> flask_tag.posts
[<Post First post>]

SQLAlchemy exposes the many-to-many relationship created using db.relationship as a list-like collection (an InstrumentedList, which subclasses Python's list), so it supports standard list operations like append and remove for interacting with the models. Also, due to the back-reference backref=db.backref('posts', lazy=True) defined in the relationship, you can append to or remove from Tag.posts and have those changes reflected in Post.tags.

>>> flask_tag.posts.remove(p)
>>> flask_tag.posts
[]
>>> p.tags
[]
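Behind the list interface, each append or remove corresponds to a row in the tags helper table once the session is flushed. A rough sketch of that bookkeeping in plain SQL, using the standard-library sqlite3 module. The table and column names follow the models above, but the title column and the statements themselves are illustrative assumptions, not what SQLAlchemy emits verbatim:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag  (id INTEGER PRIMARY KEY, text TEXT UNIQUE NOT NULL);
    CREATE TABLE tags (
        tag_id  INTEGER REFERENCES tag(id),
        post_id INTEGER REFERENCES post(id),
        PRIMARY KEY (tag_id, post_id)
    );
    INSERT INTO post (id, title) VALUES (1, 'First post');
    INSERT INTO tag  (id, text)  VALUES (1, 'flask');
""")

# p.tags.append(flask_tag): one new row links the tag to the post.
conn.execute("INSERT INTO tags (tag_id, post_id) VALUES (1, 1)")
linked = conn.execute(
    "SELECT tag.text FROM tag JOIN tags ON tags.tag_id = tag.id "
    "WHERE tags.post_id = 1"
).fetchall()

# flask_tag.posts.remove(p): the same row is deleted again.
conn.execute("DELETE FROM tags WHERE tag_id = 1 AND post_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
```

Both sides of the relationship read from the same rows, which is why a change made through Tag.posts shows up in Post.tags.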

This flexibility makes adding and removing tags simple and intuitive in the application logic. Since I don't want to add or update tags manually, I needed to incorporate this logic into the creation and updating of posts.

Automatically adding and removing tags from posts

There are two distinct events that require adding or removing tags from a post: creating a new post and editing an existing post. However, taking advantage of the list-like nature of the many-to-many relationship SQLAlchemy provides, the two tasks become quite similar.

Creating a new post

When a post is first created, it doesn't yet have any tags associated with it. Initially, I took a more complicated approach to try to minimize DB calls, but the added complexity was causing issues, and in practice most blog posts will only have a small number of tags. I opted instead for a simple loop through the list of tag strings: query to see whether each tag already exists, create a new Tag and add it to the session if it doesn't, then append the tag to post.tags.

for tag_str in tags:
    tag = Tag.query.filter_by(text=tag_str).first()
    if not tag:
        tag = Tag(text=tag_str)
        db.session.add(tag)
    post.tags.append(tag)

db.session.add(post)
db.session.commit()
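The loop above assumes `tags` is already a clean list of strings; a repeated string would append the same Tag to the post twice. A minimal sketch of one way to prepare the list, assuming the tags arrive as a single comma-separated form field (the `parse_tags` helper, the input format, and the lowercasing are all hypothetical choices, not part of the original app):

```python
def parse_tags(raw):
    """Split a comma-separated string into unique, lowercased tag strings."""
    seen = set()
    result = []
    for piece in raw.split(','):
        text = piece.strip().lower()
        # Skip blanks and repeats so each tag is appended to the post once.
        if text and text not in seen:
            seen.add(text)
            result.append(text)
    return result


tags = parse_tags('Flask, sqlalchemy, flask, ')
```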

Updating an existing post

Here again I initially took an overly complicated approach to updating the tags on posts. There was no need: since post.tags (and equivalently tag.posts) acts like a Python list, it can simply be cleared and then rebuilt with the same loop used when creating a new post.

This approach makes more DB calls, but it eliminates the chance of duplicate tags being created. Posts are updated infrequently enough that I'm not concerned by it.

post.tags.clear()
for tag_str in tags:
    tag = Tag.query.filter_by(text=tag_str).first()
    if not tag:
        tag = Tag(text=tag_str)
        db.session.add(tag)
    post.tags.append(tag)

db.session.commit()
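Since the create and update paths differ only in the initial clear(), the shared loop could be factored into one helper. A sketch under that assumption: `sync_tags` and the `get_or_create` callable are hypothetical names, and the dictionary below stands in for the Tag.query lookup so the example runs without a database:

```python
def sync_tags(collection, tag_strings, get_or_create):
    """Replace the contents of a list-like collection with the given tags."""
    collection.clear()  # a no-op for a brand-new post
    for text in tag_strings:
        collection.append(get_or_create(text))


# Stand-in for "query by text, create if missing", backed by a dict.
known = {}


def get_or_create(text):
    return known.setdefault(text, {'text': text})


post_tags = []
sync_tags(post_tags, ['flask', 'python'], get_or_create)
sync_tags(post_tags, ['flask', 'sqlalchemy'], get_or_create)
names = [t['text'] for t in post_tags]
```

In the app, `get_or_create` would wrap the Tag.query.filter_by(...).first() lookup and the db.session.add() call shown above, and the caller would still commit the session.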

Rendering tags as clickable labels

Once the models are properly set up, it is fairly trivial to iterate over the tags for a given post in a template. In this case I am using Bootstrap to format the tags as labels.

{% if post.tags -%}
    {% for tag in post.tags -%}
        <a class="label-link" href="{{ url_for('.tag_search', tag_text=tag.text) }}">
            <span class="label label-info">{{ tag }}</span>
        </a>
    {%- endfor %}
{%- endif %}

Searching for posts based on their tag

I decided to create a new endpoint for tag searches, /blog/tag/<tag text>, so the URL can be easily shared. Additionally, I wanted to ensure the view would paginate properly as the number of posts increased, so a page=page_number query string is also allowed.

These decisions introduced some complexity in the view function, but luckily Flask-SQLAlchemy's .paginate() method on BaseQuery objects will happily return a Pagination instance even if there are no records to paginate. A query for a non-existent tag returns a valid Pagination instance with an empty Pagination.items list. I decided not to worry about validating the given tag text, opting instead to ensure the app could handle an empty Pagination instance. This has the bonus side effect of still allowing valid tags to be searched for, even if all of the posts associated with that tag are either non-public or have been removed. (Currently the app does not remove tags that are not associated with a post.)

@bp.route('/tag/<tag_text>')
def tag_search(tag_text):
    page = request.args.get('page', 1, type=int)
    posts = Post.query.\
        join(Post.tags).\
        filter(Tag.text == tag_text)
    if not current_user.is_authenticated:
        posts = posts.filter(Post.public == True)
    pager = posts.paginate(page=page, per_page=current_app.config['POSTS_PER_PAGE'], error_out=False)

    # Show last page if a non-existent page is requested
    if page > pager.pages:
        return redirect(url_for('.tag_search', tag_text=tag_text, page=pager.pages))

    return render_template('blog/tag_search.html', title=tag_text, posts=pager.items, pager=pager)

This view function is fairly straightforward, though there were a few issues to resolve.

Public vs Non-public posts

Because posts can be created that are not yet public, I don't want to include non-public posts in the search results unless the user is logged in (in this app, only admins will ever be logged in). This was accomplished with a simple check on current_user.is_authenticated, adding an additional filter to remove non-public posts. SQLAlchemy Query methods such as join() and filter() each return another Query instance, and the stored SQL is only executed when a method like first() or all() is called on the query. This makes it easy to chain various Query methods together before actually issuing a DB call.

if not current_user.is_authenticated:
    posts = posts.filter(Post.public == True)
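The same build-then-execute idea can be sketched with raw SQL: conditions are collected first and the database is only touched once at the end. This uses sqlite3 with a hypothetical `posts_for_tag` function and a minimal assumed schema; it approximates the view's query and is not the SQL that SQLAlchemy generates:

```python
import sqlite3


def posts_for_tag(conn, tag_text, include_private):
    """Return titles of posts carrying tag_text, optionally hiding private ones."""
    sql = (
        "SELECT post.title FROM post "
        "JOIN tags ON tags.post_id = post.id "
        "JOIN tag ON tag.id = tags.tag_id "
        "WHERE tag.text = ?"
    )
    if not include_private:
        # Extra condition added before anything is executed.
        sql += " AND post.public = 1"
    sql += " ORDER BY post.id"
    return [row[0] for row in conn.execute(sql, (tag_text,))]


conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, public INTEGER);
    CREATE TABLE tag  (id INTEGER PRIMARY KEY, text TEXT UNIQUE NOT NULL);
    CREATE TABLE tags (tag_id INTEGER, post_id INTEGER,
                       PRIMARY KEY (tag_id, post_id));
    INSERT INTO post VALUES (1, 'Published', 1), (2, 'Draft', 0);
    INSERT INTO tag  VALUES (1, 'flask');
    INSERT INTO tags VALUES (1, 1), (1, 2);
""")

anonymous_view = posts_for_tag(conn, 'flask', include_private=False)
admin_view = posts_for_tag(conn, 'flask', include_private=True)
```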

Requesting a page beyond the last page

The page number to pass to the .paginate() method comes from the URL query string, and as such can't be trusted to be accurate. page = request.args.get('page', 1, type=int) already takes care of non-numeric input (falling back to the default of 1), but what about numbers that are too large?

Flask-SQLAlchemy will gracefully handle this situation, though the result isn't quite what I want. Passing a page number beyond the last page to .paginate() will complete successfully and return a valid Pagination instance (the .items list will, of course, be empty). From a user perspective, though, this causes issues with the rendering of page links, since Pagination.has_prev and Pagination.prev_num will contain valid data even if the previous page is also empty.

The straightforward solution is to check that the query string page number is not larger than the maximum number of pages returned by .paginate().

# Show last page if a non-existent page is requested
if page > pager.pages:
    return redirect(url_for('.tag_search', tag_text=tag_text, page=pager.pages))

Although this will cause a redirect, the benefit is that the resulting URL will include a valid page number in the query string.
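One edge case may be worth checking: assuming `pager.pages` is 0 when the query matches nothing, a request for page 1 of an unknown tag would satisfy `page > pager.pages` and redirect to `page=0`. A minimal sketch of the page arithmetic with that case guarded; `page_count` and `clamp_page` are hypothetical helpers written for illustration, not Flask-SQLAlchemy functions:

```python
from math import ceil


def page_count(total, per_page):
    """Number of pages needed for `total` items; 0 when there are none."""
    if total <= 0 or per_page <= 0:
        return 0
    return ceil(total / per_page)


def clamp_page(requested, pages):
    """Return the page to show, treating an empty result as a single page."""
    last = max(pages, 1)
    return min(max(requested, 1), last)


pages = page_count(total=23, per_page=10)
beyond_last = clamp_page(7, pages)
empty_result = clamp_page(1, page_count(total=0, per_page=10))
```

In the view, the redirect would then only be needed when `clamp_page(page, pager.pages)` differs from the requested `page`.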