very cool functional techniques - Programmers Heaven

#### Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

# very cool functional techniques

Posts: 2,900Member
[b][red]This message was edited by Moderator at 2004-3-3 10:59:59[/red][/b][hr]
I just got David Mertz's excellent book, Text Processing In Python. Until now I had never really grokked the concepts of functional programming and some of python's more obscure features like lambda, reduce, map, and filter. Here is an example that is based on some samples from the earliest chapters of the book.

Let's say we have a list of strings, perhaps the lines from a file, and we want to perform the same actions on each of them. For simplicity, let's just say that we want to force upper case, reverse, and normalize all white space to a single space.

Upper case is easy. All strings have a .upper() method already. We could alternatively use the upper() method of the string module. This is also a good way to introduce the lambda expression. A lambda creates a function whose body consists of one statement and the value of that statement is the return value of the function. Here we can create our own upper function that can be applied to any string (or any object with a .upper() method, technically):

upper = lambda s: s.upper()

This creates a function object that accepts an argument named "s" and returns the value of s.upper(). This function is then assigned to the name "upper" in the local namespace. Using our upper function is straightforward:

>>> upper('foo')
'FOO'

Reversing a string is a little less elegant. Strings don't have a reverse() method. Lists do, however, so if we convert our string to a list we can reverse that and then convert the new list back to a string. This is too much to perform in a lambda, so a regular function will have to suffice:

[code]
def reverse(s):
l = list(s)
l.reverse()
return ''.join(l)
[/code]

All strings have a .join method that accepts a list and returns a new string where the list items are joined together using the string as the delimiter. Here we just use the empty string to join our reversed list of characters.

>>> reverse('foo')
'oof'

Now, to normalize whitespace, we want to change any amount of whitespace between words into a single space. Luckily, all strings have a .split() method that returns a list of the words with all whitespace removed. We can then use the .join() method of a space character to put the list back together again into a string. For brevity's sake and because lambda's are so neat, we can do it like this:

normalize = lambda s: ' '.join(s.split())

This one line, complex as it might be to understand if you're new to these ideas, is really a very powerful statement. It doesn't matter how much whitespace is between words:

[code]
>>> s = 'foo
bar
gee'
>>> normalize(s)
'foo bar gee'
[/code]

So now we've built up the tools we need to upper, reverse, and normalize strings. If we wanted to do this to every line of a file, for example, we could do something like this:

[code]
results = []
for line in file('foo.txt', 'r'):
results.append(upper(reverse(normalize(line))))
[/code]

That's a lot of nested calls. Wouldn't it be great if we could do all the transformations in a single function call? Sure we could def a function that does all the intermediate steps and then returns the final value, but that's not as much fun.

What we want is a way to combine function calls into a single function. Try this out:

compose = lambda f, g: lambda x, f=f, g=g: f(g(x))

Wow, that's kind of ugly, isn't it? Let's take it apart.

lambda x, f, g: f(g(x))

This creates a function object that accepts three arguments, the last two of which are presumed to be functions. It then returns the result of calling f() on the result of calling g() on the first argument x. An example might help illustrate:

>>> nestedcall = lambda x, f, g: f(g(x))
>>> nestedcall('foobar', upper, reverse)
'RABOOF'

See? We pass in a string and two of our string-manipulating functions and get back the result of both of those functions applied to the string.

lambda f, g: lambda x, f=f, g=g: f(g(x))

This should make more sense now. We're creating a function that takes two arguments, f and g. This function when called will return a function that takes three arguments, the last two of which are defaulted to f and g respectively. Assigning this function to the name "compose", we now have a function that will create a function that can apply any two functions to some value.

[code]
>>> compose = lambda f, g: lambda x, f=f, g=g: f(g(x))
>>> upper_reverse = compose(upper, reverse)
>>> upper_normalize = compose(upper, normalize)
>>> upper_reverse('foobar')
'RABOOF'
>>> upper_normalize('foo bar')
'FOO BAR'
[/code]

Now the question becomes, how do we compose more than two function calls into a single function call. We already have a compose function that combines two, so we can reuse that as needed. Let's turn to the reduce() function built in to python.

reduce() in it's simplest form takes a function and a sequence as arguments. It calls the function on the first two items of the sequence, then calls the function again on the result of the first call and the third item, etc etc until all items have been used. An example to help illustrate:

15

This is an easy way of performing something like:

This can be used with any function that accepts two arguments. And look, our compose function accepts two arguments. So now we have a way of taking any number of functions and combining them into a single function call.

[code]
>>> upper_reverse_normalize = reduce(compose, [upper, reverse, normalize])
>>> upper_reverse_normalize('foo bar gee')
'EEG RAB OOF'
[/code]

Remember that python functions can accept any number of arguments with a single accumulator argument. Using this with a lambda, we can create a general function that will combine other functions:

composemany = lambda *funcs: reduce(compose, funcs)

So here is how we can use composemany to create a useful function that represents the single concept of applying three transformations to a single input:

[code]
>>> upp_rev_norm = composemany(upper, reverse, normalize)
>>> upp_rev_norm('foo

bar
gee')
'EEG RAB OOF'
[/code]

Wow, if you ask me, this is really slick.

Now let's go a little bit further if you're still with me.

The map() built-in takes two arguments. The first is a function and the second is a list of values. map() returns a list containing the results of applying the given function to each element of the given list individually. For example:

[code]
>>> l = ['foo', 'foo bar', 'foo bar gee']
>>> map(upp_rev_norm, l)
['OOF', 'RAB OOF', 'EEG RAB OOF']
[/code]

If you don't know already, in later versions of Python (2.2 and above, I think), file objects can be treated as iterators (meaning they're basically just lists that return one line at a time). Prior to 2.2 you can use the readlines() method to return a list of lines. Remember when we did this?:

[code]
results = []
for line in file('foo.txt', 'r'):
results.append(upper(reverse(normalize(line))))
[/code]

The file() function returns a file object and the for loop iterates over each line because files can be treated like lists. Now remember that the map() function works by calling a function for each item in a list. Given a file, foo.txt, with the following contents:

[code]
foo
foo bar
foo bar gee
[/code]

We can process this file in a single statement:

results = map(upp_rev_norm, file('foo.txt', 'r'))

Now, the moment of truth...

>>> results
['OOF', 'RAB OOF', 'EEG RAB OOF']

Voila! Now we have the tools we need to be able to represent the processing of a text file in a single python statement, and if you formalize everything we've done here in a module, you'll see that very little code was required to do it. upper() and normalize() were both single-line lambdas. reverse() was a trivial def in a few lines. compose and composemany were both single-line lambdas, and thanks to the map() function, we no longer need a for loop or to explicitly declare a list variable.

Very cool, if you ask me.

Happy coding.

[size=5][italic][blue][RED]i[/RED]nfidel[/blue][/italic][/size]

[code]
\$ select * from users where clue > 0
no rows returned
[/code]

• Posts: 2,900Member
If you're lurking out there, let me know what you think of this stuff and how it compares to Perl.

[size=5][italic][blue][RED]i[/RED]nfidel[/blue][/italic][/size]

[code]
\$ select * from users where clue > 0
no rows returned
[/code]

• Posts: 2,914Member
Hi,

Functional programming. A field of great ideas ripe for harvesting into our everyday programming languages, but full of languages that are hard to write our everyday programs in.

I've programmed in a functional language called ML. I found that it allowed some very elegant representations of certain problems, and moreover it is a lot closer to mathematics than the languages we tend to work in. You've noticed how list operations are done by building a new list, based around taking the list one element at a time. This is often done through recursion, though iteration happens too and often leads to a more efficient solution in terms of memory consumption (we don't build up a call stack). Iteration doesn't exist as we know it, but is implemented by using tail calls, so it still looks recursion but isn't. Whatever the case, this method of working with lists allows us to use a form of mathematical induction on our programs - that is, we can mathematically proove that our programs are correct. It's an interesting concept, though doing it for anything other than a very small program gets very difficult. I'd certainly not claim to be an expert in the field of program proof, I've only just scratched the surface of this area.

You wanted me to talk about Perl, so I will. We have map, which takes the form:-

@newlist = map BLOCK @oldlist;

@ is the sigil for an array, and we won't argue about sigils being good or evil because I'm quite sure we've been there before. :-) We have filter to, apart form it's called grep...

@newlist = grep BLOCK @oldlist;

In map the BLOCK is expected to return the new value to be put in the new list; indeed you can use a sub there. It just has to return something. In grep BLOCK is a predicate, e.g. the return value is treated in boolean context.

In Perl you can hold a reference to a function or a block just like you hold a reference to any other data type. That also means you can create anonymous functions - ML has these. Functions can have an explicit return statement, or they can work like lambda's as described in Python. So I could do:-

[code]sub double { 2 * \$_[0]; }
# or
sub double { 2 * shift; }[/code]

And I have a function double that doubles any number I feed it with. Note: @_ is the array containing a sub's parameters, and shift (and many other list operators) operate on @_ by default if there is no list specified.

Let's reverse a list of, say, anything, functionally (though Perl has a reverse function we can just use).

[code]sub fReverse { @_ == 1 ? @_ : (pop, fReverse(@_)); }[/code]

As for normalize...

[code]sub normalize { join ' ', split(/s/, shift); }[/code]

Your compose function is fairly neat, and we can emulate that in Perl by returning a sub that calls all these functions one after the other. We pass in a list of references to the subs to call.

[code]sub compose {
my @subsToCall = @_;
return sub {
my @curData = @_;
while (@subsToCall) {
@curData = &{shift @subsToCall} (@curData);
}
return @curData;
}
}[/code]

So, here's a full example of some functional-ish stuff.

[code]sub fReverse { @_ == 1 ? @_ : (pop, fReverse(@_)); }
sub addOne { @_ == 0 ? () : ((shift) + 1, addOne(@_)); }
sub compose {
my @subsToCall = @_;
return sub {
my @curData = @_;
while (@subsToCall) {
@curData = &{shift @subsToCall} (@curData);
}
return @curData;
}
}

@newList = &\$addRev(1, 2, 3, 4);
print join(' ', @newList);
[/code]

That prints 5 4 3 2 as you'd expect.

We'll leave it there for now. :-)

Jonathan

###
for(74,117,115,116){\$::a.=chr};((\$_.='qwertyui')&&
(tr/yuiqwert/her anot/))for(\$::b);for(\$::c){\$_.=\$^X;
/(p.{2}l)/;\$_=\$1}\$::b=~/(..)\$/;print("\$::a\$::b \$::c hack\$1.");

• Posts: 2,900Member
: We'll leave it there for now. :-)

Interesting. Thanks for the input. I'll refrain from comments on sigils and cryptic syntax :-)

[size=5][italic][blue][RED]i[/RED]nfidel[/blue][/italic][/size]

[code]
\$ select * from users where clue > 0
no rows returned
[/code]

• Posts: 2,914Member
: : We'll leave it there for now. :-)
:
: Interesting. Thanks for the input.
Welcome. BTW, I mentioned ML without giving any URLs...well, here is a place to download a GPL'd ML interpreter:-
http://www.dina.kvl.dk/~sestoft/mosml.html

: I'll refrain from comments on sigils and cryptic syntax :-)
:
It's only cryptic if you don't understand it. :-)

I guess we'll never agree here...we've already been through the "I love curly braces in languages"..."Ugh, not for me" thing. It's interesting how different people prefer to work in different languages, though.

Jonathan

###
for(74,117,115,116){\$::a.=chr};((\$_.='qwertyui')&&
(tr/yuiqwert/her anot/))for(\$::b);for(\$::c){\$_.=\$^X;
/(p.{2}l)/;\$_=\$1}\$::b=~/(..)\$/;print("\$::a\$::b \$::c hack\$1.");

• Posts: 2,900Member
: : : We'll leave it there for now. :-)
: :
: : Interesting. Thanks for the input.
: Welcome. BTW, I mentioned ML without giving any URLs...well, here is a place to download a GPL'd ML interpreter:-
: http://www.dina.kvl.dk/~sestoft/mosml.html

Cool. I've heard of ML and some of the others, but haven't yet tried them out. The distinction between functional and imperative programming languages/styles is still kind of fuzzy to me. I tried Icon once, I'm not sure where it fits on this spectrum, but it was kind of hard to grok.

: : I'll refrain from comments on sigils and cryptic syntax :-)
: :
: It's only cryptic if you don't understand it. :-)
:
: I guess we'll never agree here...we've already been through the "I love curly braces in languages"..."Ugh, not for me" thing. It's interesting how different people prefer to work in different languages, though.

Curly braces are better than 'End If' statements, but indentation is better than braces. IMO, of course. There's an implementation of PSP (python server pages) that uses curly braces in the <% %> tags to delimit python blocks because HTML does not preserve whitespace strictly. I've often thought of doing some kind of XML-based python code generation but was never able to think of a way to handle the indentation stuff. Duh on me. Funny how I have grown so biased against braces that I couldn't even see their practicality in this situation.

[size=5][italic][blue][RED]i[/RED]nfidel[/blue][/italic][/size]

[code]
\$ select * from users where clue > 0
no rows returned
[/code]

• Posts: 2,914Member
: : : : We'll leave it there for now. :-)
: : :
: : : Interesting. Thanks for the input.
: : Welcome. BTW, I mentioned ML without giving any URLs...well, here is a place to download a GPL'd ML interpreter:-
: : http://www.dina.kvl.dk/~sestoft/mosml.html
:
: Cool. I've heard of ML and some of the others, but haven't yet tried them out. The distinction between functional and imperative programming languages/styles is still kind of fuzzy to me. I tried Icon once, I'm not sure where it fits on this spectrum, but it was kind of hard to grok.
:
Not heard of Icon. As for functional languages being hard to grok...yeah, I can go with that one. Some people say they just seem obvious to them and like them, though having experienced mostly imperative languages before it was somewhat different.

Imperative - you write your program by writing a series of instructions that are followed one after the other.

Functional - you write your program entirely around the results of calls to functions

Thing is that functional languages often are not that pure - they do allow you to go a bit imperative if you should need to. Certainly ML does anyway. You can also have things like inner functions (functions inside functions, scoped only inside that function) and curried functions (functions which you can evaluate partially, then use it in the partially evaluated form many times later).

: : : I'll refrain from comments on sigils and cryptic syntax :-)
: : :
: : It's only cryptic if you don't understand it. :-)
: :
: : I guess we'll never agree here...we've already been through the "I love curly braces in languages"..."Ugh, not for me" thing. It's interesting how different people prefer to work in different languages, though.
:
: Curly braces are better than 'End If' statements, but indentation is
: better than braces. IMO, of course.
I can imagine that indentation forces you into a much stricter layout, though. Not that this is always a bad thing. I can't say I dislike the idea of indentation, I'm just more comfortable with curly braces.

: There's an implementation of PSP (python server pages) that uses
: curly braces in the <% %> tags to delimit python blocks because HTML
: does not preserve whitespace strictly.
That's just what PHP, ASP etc do. There is something called Mason which lets you embed Perl in HTML and I'd imagine there's something similar in there.

: I've often thought of doing some kind of XML-based python code
: generation but was never able to think of a way to handle the
: indentation stuff. Duh on me. Funny how I have grown so biased
: against braces that I couldn't even see their practicality in this
: situation.
:
I've grown biased against VB, it's just practical in some situations. (But don't tell KDL I said that... ;-)

Jonathan

###
for(74,117,115,116){\$::a.=chr};((\$_.='qwertyui')&&
(tr/yuiqwert/her anot/))for(\$::b);for(\$::c){\$_.=\$^X;
/(p.{2}l)/;\$_=\$1}\$::b=~/(..)\$/;print("\$::a\$::b \$::c hack\$1.");