Introduction to ragged tensors


Introduction

 
We review TensorFlow's concept of ragged tensors, which were introduced at the end of 2018. We explain their basic structure and why they are useful.
 

Problem statement

 

With the standardization of traditional machine learning problems, many models are very easy to implement: read a table of features from a database, use pandas and numpy for preprocessing, build a model with one of the well-known libraries, and type model.fit(). In many cases – that’s it! Done!

But wait – what if the data does not come in tabular form and is irregular by nature? What if there are instances with varying dimensions? Consider a scenario for time series classification and suppose we have a dataset consisting of four different short time series:

[Figure: the four time series, each with measurements at irregular timestamps]

As you can see, the series differ in both the number and the timing of the measurements. Since machine learning models typically require a fixed input size, it’s a bit more complicated to fit such data into our models.

There are a number of possibilities to handle this type of input; for example, we could interpolate the series and take virtual measurements at the same timestamps for each series:

[Figure: the four series interpolated at the common timestamps 0, 2, 4, 6, 8, 10]

Here we take the values at timestamps 0, 2, 4, 6, 8, and 10 such that every series consists of 6 values. At this stage, however, we already have to choose hyperparameters such as the type of interpolation, the number of values, etc. Moreover, we cannot rely on the accuracy of the interpolation, especially for extrapolated values and for values within large gaps between successive measurements (see the orange and green series at time 10).
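
To make the interpolation step concrete, here is a minimal sketch in plain NumPy. The timestamps and values are those of the second series from our example; note that np.interp performs only linear interpolation and clamps values outside the measured range, which is exactly the extrapolation issue mentioned above:

import numpy as np

# second series from the example: 3 measurements at irregular timestamps
t = np.array([0, 5, 8])
v = np.array([15, 11, 7])

grid = np.arange(0, 12, 2)  # common virtual timestamps 0, 2, ..., 10
np.interp(grid, t, v)       # linear interpolation; values beyond t=8 are clamped to 7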

From the technical side, when we feed the data into a TensorFlow Keras model and do not want to use interpolation techniques, a common practice is to pad the series, e.g. with zeros at the end. This is necessary because TensorFlow groups data together in batches, which must have the same shape in every dimension. A batch of the 4 series above would have the shape (4, 6), with 4 being the number of series (= batch dimension) and 6 being the number of measurements per series.

However, the 6 arises from artificial data, either interpolated measurements or padding values. To overcome the uncertainty and the overhead of both these techniques, we can use ragged tensors to work with the original data.

Concept of ragged tensors

 

The concept of ragged tensors is surprisingly easy after understanding the intention behind them. Let’s stick with our above example with 4 time series. As you can see, the minimum number of measurements per series is 3, while the maximum is 5. With padding we would have to fill every series with zeros at the end (or sometimes at the beginning) to achieve a common length of 5.
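
As a quick illustration of the padding approach, here is a minimal sketch with plain NumPy; the nested lists are the four series from our example:

import numpy as np

series = [np.array(s) for s in [
    [[0, 3], [3, 1], [6, 8], [8, 0], [10, 9]],
    [[0, 15], [5, 11], [8, 7]],
    [[0, 12], [2, 7], [4, 8], [9, 2]],
    [[0, 9], [4, 0], [6, 13], [10, 4]],
]]
max_len = max(len(s) for s in series)  # 5

# pad every series with zero rows at the end to the common length
padded = np.stack([np.pad(s, ((0, max_len - len(s)), (0, 0))) for s in series])
padded.shape  # (4, 5, 2)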

In contrast, a ragged tensor consists of the concatenation of all values from all series, together with metadata specifying where to split the concatenation into the individual series. Let’s define our dataframe df and then our ragged tensor rt:

import pandas as pd
import tensorflow as tf

# the 16 measurements of all four series, concatenated top to bottom
df = pd.DataFrame({
    "time":  [0, 3, 6, 8, 10, 0, 5, 8, 0, 2, 4, 9, 0, 4, 6, 10],
    "value": [3, 1, 8, 0, 9, 15, 11, 7, 12, 7, 8, 2, 9, 0, 13, 4],
})
row_splits = [0, 5, 8, 12, 16]
rt = tf.RaggedTensor.from_row_splits(values=df.values, row_splits=row_splits)
rt

<tf.RaggedTensor [[[0, 3], [3, 1], [6, 8], [8, 0], [10, 9]], [[0, 15], [5, 11], [8, 7]], [[0, 12], [2, 7], [4, 8], [9, 2]], [[0, 9], [4, 0], [6, 13], [10, 4]]]>

As we can see, the row_splits array defines the individual series: each consecutive pair of entries gives the start row (inclusive) and end row (exclusive) of one series in the concatenation.
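
Both parts of this structure can be inspected directly on the tensor; a quick look at the attributes TensorFlow exposes for this:

rt.values.shape         # (16, 2) – all measurements concatenated
rt.row_splits.numpy()   # array([ 0,  5,  8, 12, 16])
rt[1]                   # the second series: [[0, 15], [5, 11], [8, 7]]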

That’s it. This is the really simple structure of ragged tensors. As an alternative to specifying the row_splits, we can also create the same ragged tensor with one of the following methods:

  • value_rowids: for every row in the concatenated series we specify an id number which indexes the individual series:
value_rowids = [0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
rt_1 = tf.RaggedTensor.from_value_rowids(values=df.values, value_rowids=value_rowids)
rt_1

<tf.RaggedTensor [[[0, 3], [3, 1], [6, 8], [8, 0], [10, 9]], [[0, 15], [5, 11], [8, 7]], [[0, 12], [2, 7], [4, 8], [9, 2]], [[0, 9], [4, 0], [6, 13], [10, 4]]]>
  • row_lengths: we state the length of every individual series:
row_lengths = [5, 3, 4, 4]
rt_2 = tf.RaggedTensor.from_row_lengths(values=df.values, row_lengths=row_lengths)
rt_2

<tf.RaggedTensor [[[0, 3], [3, 1], [6, 8], [8, 0], [10, 9]], [[0, 15], [5, 11], [8, 7]], [[0, 12], [2, 7], [4, 8], [9, 2]], [[0, 9], [4, 0], [6, 13], [10, 4]]]>
  • constant: we can define the ragged tensor as a “constant” by directly specifying a list of arrays:
rt_3 = tf.ragged.constant([df.loc[0:4, :].values, df.loc[5:7, :].values, df.loc[8:11, :].values, df.loc[12:15, :].values])
rt_3

<tf.RaggedTensor [[[0, 3], [3, 1], [6, 8], [8, 0], [10, 9]], [[0, 15], [5, 11], [8, 7]], [[0, 12], [2, 7], [4, 8], [9, 2]], [[0, 9], [4, 0], [6, 13], [10, 4]]]>
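
As a quick sanity check that all four constructions describe the same tensor, the metadata can be converted between the representations using the corresponding RaggedTensor methods:

rt.row_lengths().numpy()   # array([5, 3, 4, 4])
rt.value_rowids().numpy()  # array([0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3])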

Internally, it does not matter which method we choose to create a ragged tensor; the results are all equivalent. Next we’ll see how to perform mathematical operations on ragged tensors.

Working with ragged tensors

 

TensorFlow provides a very handy function to perform operations on ragged tensors: tf.ragged.map_flat_values(op, *args, **kwargs). It does what the function name says – every ragged tensor in args is substituted by its concatenated (= flat) version, omitting the batch dimension. In our example, this is the same as if we operated on df.values directly. The only difference is that the output of the operation is again a ragged tensor with the same metadata about where to split. Let’s consider an example where we compute the matrix product of the ragged tensor with a matrix m of shape (2, 5). Each individual series in our ragged tensor has shape (k, 2), where k corresponds to the number of measurements in the given series, so the flat values have shape (16, 2) and the product has shape (16, 5). Note that we first have to cast the values to floats:

m = tf.random.uniform(shape=[2, 5])
print(m.shape)

(2, 5)

rt = tf.cast(rt, tf.float32)
result = tf.ragged.map_flat_values(tf.matmul, rt, m)
print(*(t.shape for t in result), sep='\n')

(5, 5)
(3, 5)
(4, 5)
(4, 5)
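
We can verify the claim that map_flat_values simply operates on the concatenated values; a small sanity check using the flat_values attribute, which holds the concatenation:

flat = tf.matmul(rt.flat_values, m)              # shape (16, 5)
bool(tf.reduce_all(flat == result.flat_values))  # True – same numbers, just re-split
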
Perfect! The resulting ragged tensor has the same row splits as the input, but the inner dimension changed from 2 to 5 because of the matrix multiplication. We could do some more complicated operations, for example if m is not a 2-dimensional matrix, but a 3-dimensional tensor:
m = tf.random.uniform(shape=[2, 5, 4])
print(m.shape)

(2, 5, 4)

rt = tf.cast(rt, tf.float32)
result = tf.ragged.map_flat_values(tf.einsum, "bi, ijk -> bjk", rt, m)
print(*(t.shape for t in result), sep='\n')

(5, 5, 4)
(3, 5, 4)
(4, 5, 4)
(4, 5, 4)

As expected, the batch dimension b corresponds to the length of the individual series, while the other dimensions originate from m. By the way, tf.einsum implements the Einstein summation convention, which is extremely handy when working with higher-dimensional tensors.

One last thing: it is also very easy to perform aggregations over ragged tensors. For example, if we want to know the column-wise sums within each series, we can use the standard reduction functions:

tf.reduce_sum(rt, axis=1)

<tf.Tensor: shape=(4, 2), dtype=float32, numpy=
array([[27., 21.],
       [13., 33.],
       [15., 29.],
       [20., 26.]], dtype=float32)>

There exist many more operations for ragged tensors, which are listed in the TensorFlow documentation.
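
For instance, elementwise arithmetic and other reductions work directly on ragged tensors as well; a quick sketch:

(rt * 2.0)[1]               # elementwise ops keep the ragged structure
tf.reduce_mean(rt, axis=1)  # per-series column means, shape (4, 2)
tf.reduce_max(rt, axis=1)   # per-series column maxima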

Conclusion

We learned about the structure of TensorFlow ragged tensors and how to perform basic mathematical operations on them. They make it unnecessary to apply unnatural preprocessing techniques like interpolation or padding. This is especially useful for irregular time series datasets, although there are many other applications. Imagine a dataset with images of various sizes – ragged tensors can even handle multiple ragged dimensions, which makes them a perfect fit for that.
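
For example, a hypothetical batch of two tiny “images” with different heights and widths can be stored directly; a minimal sketch:

images = tf.ragged.constant([
    [[1, 2, 3], [4, 5, 6]],       # a 2 x 3 image
    [[7, 8], [9, 10], [11, 12]],  # a 3 x 2 image
])
images.shape  # (2, None, None) – both inner dimensions are ragged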

In a subsequent post I will dive a bit deeper into how to work with ragged tensors as input types for a Keras model by treating the individual time series as sets and performing attention directly on the ragged tensors. Stay tuned!

Torben Windler
