Recent changes
Table of Contents
·FAQ

Tuples

Tuples are a grouping of values (components), kind of like C-style structs. In a struct, each component is given a unique name. In a tuple, there are no names; a component is identified solely based on its position in the tuple.

// Tuple type
(int, String, bool)

// Sort-of-but-not-quite-equivalent struct type
struct {
   int v1
   String v2
   bool v3
}

Usually, though, you don’t access tuples like you access structs. You use parallel assignment syntax to compose and decompose tuples from and to individual elements.

typedef (int, String, bool) MyTuple

// Composing a tuple
MyTuple t = (1,"Hello",false)

// Decomposing a tuple
int i
String s
bool b
(i,s,b) = t

assert(i == 1)
assert(s == "Hello")
assert(b == false)

// A language might provide more convenient syntax to decompose
// a tuple into newly declared variables:
(int i, String s, bool b) = t

Function Parameters and Return Values

Does that last line remind you of anything? It looks a lot like a function parameter list in a C-style language. That’s because C-style function parameter lists are tuples. Some languages (like the functional programming languages Haskell and ML) let you use tuples all over the place. Most C-style languages only allow them as arguments to functions.

One way of looking at it is that a function always takes exactly one value and returns exactly one value. If you want to pass in more than one parameter, you have to use a tuple.

Take this C function signature, for example:

void jumpTo(int x, int y)

The jump function takes a single argument. That argument is a tuple of two ints. Get it? It’s kind of like having:

struct JumpParams {
   int x
   int y
}

void jump(JumpParams params)

Wouldn’t it be a pain if you were forced to do that for every function that accepts more than one parameter? Yes it would, which is why C uses tuple-like syntax to help you out. But what if you want to return more than one value? Let’s say the jump function also returned two float values (which represent the angle and distance that you need to jump to get to the requested location).

Since most C-style languages only let functions return one value, you have to use a workaround.

The most straightforward method is to wrap the return types in a separate struct:

struct JumpReturnVal {
   float angle
   float distance
}
JumpReturnVal jump(int x, int y)

This is not a good solution because you’re forced to do different things with function parameters and function return values. Why explicitly wrap return values in a separate struct if you don’t have to do that for function parameters?

You could also accept pointers as arguments into the function:

void jump(int x, int y, float *angle, float *distance)

// In C#, you can use "out" parameters
void jump(int x, int y, out float angle, out float distance)

The problem with the pointer method is that it’s not as straightforward. The convention is that a function’s output is sent out through the return value mechanism. With pointers and “out” parameters, the return values look kinda like inputs instead of outputs.

The C++ Standard Template Library provides a generic pair type:

pair<int,int> jump(float angle, float distance)

This solution is insufficient because it only works for functions that return two arguments. You could create more generic types for larger argument lists, but it still wont work as smoothly as function argument lists do. Another problem is that the mechanism for multiple parameters is still different from the mechanism for multiple return values.

With tuples, all you have to do is:

(float, float) jump(int x, int y) {
   float angle = cos(x/y)
   float distance = sqrt(x*x + y*y)
   return (angle, distance)
}

int x = 4
int y = 2
(float angle, float radians) = jump(x, y)

If you’re looking at the code above and thinking, “Man…that function signature doesn’t read well,” you’re right. It’s ugly. But that’s just because the C syntax for type definitions doesn’t work well with tuples. If you use another standard style (putting the type after the identifier), things look a little cleaner:

jump(int x, int y) : (float, float)
{
   /* ... */
}

Isn’t that better?

Tuples are a generalization of a concept programmers are already familiar with (through function argument lists). They should be added, swiftly and mercilessly, to every programming languge in existence. C# missed out on a great opportunity to add some value over Java. On the other hand, tuple syntax doesn’t really conflict too badly with the existing language grammar, so maybe there’s still a chance.

FAQ

The difference between tuples and lists

A list (or array) is a sequence of elements of the same type (aka a sequence of homogenous elements). A tuple is a fixed-size group of elements (like a C-style struct without explicit field names). The type of a tuple is more complicated (there is one type entry for each element in the tuple).

Here are examples of values and their types, separated by colons. (Parens are used for tuples, square brackets are used for lists.)

(1, 2, 3)  :  (int, int, int)
[1, 2, 3]  :  int[]

(1, "String", true, "Goose")  : (int, String, boolean, String)
[1, "String", true, "Goose"]  : Object[]

The last example may or may not cause a compile-time error, depending on whether the language will automatically use the common supertype of all of the list’s elements.

Singleton tuples

Singleton tuples (tuples with one element) do not make sense.

At this point, a Python user may feel the need to point out GvR’s explanation for why Python has singleton tuples. The difference is that Python tuples aren’t really tuples. They are immutable lists. I think they’re called tuples because they support parallel assignment, but that’s not what makes something a tuple. You need stronger typing to have real tuples.

Why don’t singleton tuples make sense? An N-tuple of integers is “N integers”. If you have a 3-tuple of integers, you have “three integers”. If you have a 1-tuple of integer, all you have is “one integer” (which is just an individual element, not at tuple).

Because of this, tuples are slightly different from records (C structs). Records are objects that contain other objects. A tuple is simply a collection of objects “glued” together – no container. I think the main problem is that the surrounding parens look like a container. If you think of the comma as the symbol that creates tuples (since the parens are just for generic expression grouping), it’s easy to see that singleton tuples don’t make sense. In OCaml, the “*” symbol is used to combine types to form tuple types (which is based on the traditional mathematical notation).

(int, int)           -->  int * int
(int, String, bool)  -->  int * String * bool
(int)                -->  int
int                  -->  int

It turns out to be fortunate that singleton tuples don’t make sense because it avoids the ambiguity caused by other uses of parenthesis: grouping of expressions. Otherwise, we’d have to decide how to deal with the expression (5+6). Is the expression’s type int or is it the singleton tuple (int)?

Empty tuples

Empty tuples don’t make sense either. Many languages, however, have syntatic sugar to make it seem like they exist. They let you write “()”, which looks like an empty tuple, but is actually a value with a special type (often called “unit”).

In some languages, the “unit” type is used in places where you might use “void” in C-style languages:

PrintNewline() : ()

C’s equivalent of “unit” is “void” (though C’s weak tuple support makes “void” seem like a special marker for functions that don’t return a value).

Event Multicast

When you combine tuples with generics, many possibilities open up. In C-style languages, tuples can only be used in function parameter lists, so you end up running into problems when you want to interact generically with functions.

One place where this problem appears is when you’re creating event multicast classes.

The worst possible way

In Java, an event multicast class for mouse click events might be implemented like this:

// Listener objects must implement this interface
interface EventListener
{
   void EventOccurred(int x, int y)
}

// This is the multicast class that listeners have to register themselves with.
class EventMulticaster
{
   List<EventListener> listeners = new List<EventListener>()
   
   // Listeners will call this to register themselves
   void AddListener(EventListener l)
   {
      listeners.add(l)
   }

   // The event source will call this method.
   void EventOccurred(int x, int y)
   {
      for (EventListener l : listeners) {
         l.EventOccurred(x, y)
      }
   }
}

Whenever the event source calls EventOccurred, all the registered listeners will be notified.

(Though C# delegates make event multicast a little easier, they don’t solve the fundamental problem. I’ll stick to plain old methods to keep things simple).

So what’s the problem? The problem is that this class will only work for events that use two int parameters to hold the relevant event information. This works fine for mouse clicks, but you have to duplicate the entire chunk of code for every set of parameters. I know what you’re thinking. You’re thinking that we should use parametrics types (a.k.a. generics) to abstract all that away.

The generic way

// Holds the context information about our event
class MouseEventInfo
{
   int x
   int y
   MouseEventInfo(int x, int y) { this.x = x; this.y = y }
}

interface EventListener<Param>
{
   void EventOccurred(Param p)
}

class EventMulticaster<Param>
{
   List<EventListener<Param>> listeners = new ArrayList()
   
   void AddListener(EventListener<Param> l)
   {
      listeners.add(l)
   }

   void EventOccurred(Param p)
   {
      for (EventListener<T> l : listeners)
         l.EventOccurred(p)
   }
}

To use it:

EventMulitcaster<MouseEventInfo> em = getMouseEventMulticaster()
em.EventOccurred(new MouseEventInfo(4, 6))

class MyMouseListener implements EventListener<MouseEventInfo>
{
   void EventOccured(MouseEventInfo params) { /* ... */ }
}

This works. But now we have to create a separate struct (like MouseEventInfo) for every event type. We need a way to create these on the fly.

The tuple way

Tuples eliminate the need to explicitly define container classes. You can create them on the fly.

EventMulticaster<(int,int)> em = getMouseEventMulticaster()
em.EventOccurred((4, 6))

class MyMouseListener implements EventListener<(int,int)>
{
   void EventOccurred((int, int) params)
   {
      int x
      int y
      (x, y) = params
   }

   // A language could allow special syntax in the argument list
   // to unwrap the tuple automatically:
   void EventOccurred((int x, int y)) { /* ... */ }
}

Note that we haven’t yet added any new features to the Java/C# style generics model. We’ve added tuples to the language (but other Java-style languages already have this; Nice, Nemerle). Those languages (I think) can already support the above implementation. To add this to C# or Java, you don’t have to change the generics semantics.

What? You don’t like the extra set of parentheses? Hmm…they are kinda ugly. We need a way to “inline” a tuple into its container.

Inlining Tuples

I think that a “..” suffix is intuitive syntax for inlining. I think that either Perl, Python or Ruby uses a prefix “*” operator to accomplish a similar thing.

class EventListener<Param>
{
   void EventOccurred(Param..)
}

class EventMulticaster<Param>
{
   List<EventListener<Param>> listeners = new List<EventListener<Param>>()
   
   void AddListener(EventListener<Param> l)
   {
      listeners.add(l)
   }

   void EventOccurred(Param.. p)
   {
      for (EventListener<T> l : listeners)
         l.EventOccurred(p..)
   }
}

Now you can write:

EventMulticaster<(int,int)> em = getMouseEventMulticaster()
em.EventOccurred(4, 6)

class MyMouseListener implements EventListener<(int,int)>
{
   void EventOccurred(int x, int y) { /* ... */ }
}

In this situation, the inlining is just syntatic sugar, but I think it’s worth it.

Inlining specifics

Generic type parameter list

If a generic type parameter declaration has a “..” suffix, then you don’t need parenthesis when instantiating it. For example, we could have defined the EventListener class like this:

class EventListener<Param..> {
   void EventOccurred(Param..)
}

// Now, to instantiate for an event with an (int, int) signature, we
// don't need the parens.
EventListener<int, int>

No big deal, just a little convenience.

To avoid ambiguity, you can only have one inlined type parameter in every parameter list.

Parameter lists

A function parameter can also be inlined. This feature was used in the event multicast example.

// Totally trivial use of inlining:
void GotoXY((int, int).. location) {
   /* ... */
}

// Equivalent to:
void GotoXY(int x, int y) {
   (int, int) location = (x, y)
   /* ... */
}

// Inlining is more useful with generic type parameters.
X = (int, int)
void EventOccurred(X..)
void EventOccurred(int, int)

DoSomething(int i, String s, boolean b) { /* ... */ }
DoSomething(123, "Hello", true)
var temp = (123, "Hello")
DoSomething(temp.., true)
var temp = (123, "Hello", true)
DoSomething(temp..)

Value expressions

The event multicast example also used inlining to expand a tuple value.

void EventOccurred(int x, int y)

// Invocations
EventOccurred(1, 2)
EventOccurred((1, 2)..)
(int, int) location = (1, 2)
EventOccurred(location..)

Other uses

This same syntax can be used to implement C#’s “delegate” feature cleanly. (That is, ignoring “out” parameters, which are stupid and should be shot on sight. Seriously though, tuple return values can easily replace them). Currently, delegates are implemented through back-end hacks in the CLR implementation.

data/tuples.txt Last modified: 01.10.2008 00:37
Driven by DokuWiki