
Nice preview of dataclasses in notebooks

Python
Pierre-Antoine Comby
PhD Student on Deep Learning for fMRI Imaging

When I am working in Jupyter notebooks, I really enjoy the nice visualisation of dataframes as HTML tables, and I wish other objects had a pretty representation as well.

This is actually quite easy. IPython provides an API for rich display: any object can implement a _repr_html_ method that returns an HTML string, and the notebook renders it. I came up with the following:

import html
from typing import Any


def _my_repr_html_(obj: Any) -> str:
    """
    Recursive HTML representation for dataclasses.

    This function generates an HTML table representation of a dataclass,
    including nested dataclasses.

    Parameters
    ----------
    obj: The dataclass instance.

    Returns
    -------
        str: An HTML table string representing the dataclass.
    """
    class_name = obj.__class__.__name__
    table_rows = [
        '<table style="border:1px solid lightgray;">'
        '<caption style="border:1px solid lightgray;">'
        f"<strong>{class_name}</strong></caption>"
    ]
    for field_name, field_value in obj.__dict__.items():
        # Recursively call _repr_html_ for nested dataclasses
        try:
            field_value_str = field_value._repr_html_()
        except AttributeError:
            # No rich representation: fall back to the escaped repr(), so
            # that text such as "<Foo object at ...>" is not parsed as HTML
            field_value_str = html.escape(repr(field_value))

        table_rows.append(
            f"<tr><td>{field_name}</td><td>{field_value_str}</td></tr>")
    table_rows.append("</table>")
    return "\n".join(table_rows)
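
As an aside, the hook itself is small: when an object is the result of a notebook cell, IPython looks for a _repr_html_ method and renders the string it returns. A minimal sketch of the protocol on its own, using a hypothetical Badge class that is not part of the code above:

```python
import html


class Badge:
    """Hypothetical example class, used only to illustrate the protocol."""

    def __init__(self, label: str) -> None:
        self.label = label

    def _repr_html_(self) -> str:
        # IPython calls this method, when present, to build the rich display.
        # Escape the label so that characters like "<" are shown literally.
        return f"<strong>{html.escape(self.label)}</strong>"


badge = Badge("a < b")
# Outside a notebook the method is just a regular method returning a string.
print(badge._repr_html_())
```

In a notebook, ending a cell with badge would show the label in bold instead of the default text repr.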

And integrating it with dataclasses is as easy as:

from dataclasses import dataclass

@dataclass
class Human:
    name: str
    age: int

    # No annotation, so dataclass does not treat this as a field
    _repr_html_ = _my_repr_html_


@dataclass
class Pet:
    name: str
    age: int
    owner: Human

    _repr_html_ = _my_repr_html_

# If we instantiate some objects
joe = Human(name="joe", age=25)
rex = Pet(name="rex", age=2, owner=joe)
# their default representation is fine, but could be prettier
print(rex)

But with _repr_html_, we get a nicer preview in the form of an HTML table:

Some things can still be improved:

  • With several levels of nested objects, the table gets very tall; it would be nice to use some horizontal space as well.
  • We don’t leverage the type annotations of the dataclasses (for what it’s worth, this _repr_html_ function could be used with regular objects too!).
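
For the second point, the standard library already exposes what is needed: dataclasses.fields lists the fields, and typing.get_type_hints resolves their annotations, including ones written as strings. A small sketch of that lookup in isolation, assuming a hypothetical Point dataclass:

```python
from dataclasses import dataclass, fields
from typing import get_type_hints


@dataclass
class Point:  # hypothetical example class
    x: int
    label: "str"  # string annotation, as produced by forward references


# Field.type keeps the annotation exactly as written, so "str" stays a string.
raw_types = {f.name: f.type for f in fields(Point)}

# get_type_hints evaluates string annotations into real types.
hints = get_type_hints(Point)
type_names = {f.name: hints[f.name].__name__ for f in fields(Point)}
print(type_names)
```

Resolving the hints first means the displayed type name does not depend on how the annotation was spelled in the class body.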

Let’s fix that.

import html
from dataclasses import fields
from typing import Any, get_type_hints


def _my_repr_html_(obj: Any, vertical: bool = True) -> str:
    """
    Recursive HTML representation for dataclasses.

    This function generates an HTML table representation of a dataclass,
    including nested dataclasses. Nested tables alternate between the
    vertical and horizontal layouts.

    Parameters
    ----------
    obj: The dataclass instance.
    vertical: If True, one row per field; otherwise one column per field.

    Returns
    -------
        str: An HTML table string representing the dataclass.
    """
    class_name = obj.__class__.__name__
    table_rows = [
        '<table style="border:1px solid lightgray;">'
        '<caption style="border:1px solid lightgray;">'
        f"<strong>{class_name}</strong></caption>"
    ]
    # Resolve hints on the class rather than the instance, so that inherited
    # fields and string annotations are handled.
    resolved_hints = get_type_hints(type(obj))

    field_names = [f.name for f in fields(obj)]
    field_values = {name: getattr(obj, name) for name in field_names}
    # Display name of each type; fall back to repr() for annotations
    # without a __name__ (e.g. int | None).
    field_types = {
        name: html.escape(
            getattr(resolved_hints[name], "__name__", repr(resolved_hints[name]))
        )
        for name in field_names
    }

    def render(value: Any) -> str:
        # Recursively call _repr_html_ for nested dataclasses,
        # alternating the orientation at each nesting level
        try:
            return value._repr_html_(vertical=not vertical)
        except (AttributeError, TypeError):
            # No _repr_html_, or one that does not accept `vertical`:
            # fall back to the escaped repr()
            return html.escape(repr(value))

    if vertical:  # one row per field
        for name in field_names:
            table_rows.append(
                f"<tr><td>{name} (<i>{field_types[name]}</i>)</td>"
                f"<td>{render(field_values[name])}</td></tr>"
            )
    else:  # one header row, then one row holding all the values
        table_rows.append(
            "<tr>"
            + "".join(
                f"<td>{name} (<i>{field_types[name]}</i>)</td>"
                for name in field_names
            )
            + "</tr>"
        )
        table_rows.append(
            "<tr>"
            + "".join(f"<td>{render(field_values[name])}</td>" for name in field_names)
            + "</tr>"
        )
    table_rows.append("</table>")
    return "\n".join(table_rows)

This gives the following output:

Nice, huh?

Conclusion

A _repr_html_ method can bring some nice eye candy with a minimal amount of code. More powerful object visualisations exist, from Graphviz to fully fledged UML, but they do not offer the interactivity of this simple approach.