04 Jul 2009 @ 6:27 PM 

My recent blog essay about the Physical Symbol System Hypothesis got me thinking that it would be fun to revisit more ideas from the history of AI and make a series out of it.

Even though I was poor and even though I left without my doctorate, I remember my years in graduate school with fondness. Teaching classes was a rewarding way to pay the rent, but the best part was the (mostly) open atmosphere of intellectual curiosity with which academics viewed and participated in the flow of research agendas.

At the time, most people around me were thinking a lot about neural networks — not so much in the sense of brain imitation but more as complicated combinations of continuous nonlinear functions. And in other ways as well, the very air was heavy with the humid scent of nonlinearity — chaos, complexity, evolution, and so on. It felt (to me at least) as if a cross-disciplinary version of AI was on the brink of some important gestalt understanding, and it was exciting.

In 1992, John Koza’s huge book Genetic Programming came out, and I remember devouring it avidly. Who wouldn’t be thrilled at the grand vision of evolving computer programs?

Looking back on it, I was interested in the properties of new classes of computational substrates. In the language I am using now to muse about these topics, a substrate is a formal structure in which computations occur. It includes not only a specification for the computations themselves but also the means for their creation, modification, and execution. So I was in a frenzy of trying to see beyond the habitual AI substrates (mostly “hand coded declarative logic-ish databases”).

Ideally I’m interested in something more specific — substrates for models — but I don’t yet have a clear idea about how to express the general requirements for characterizing models (that’s the long-term purpose behind all of the musing I’m doing lately). I know that some sort of intentional relation is required, because models are the way the universe reflects itself. So instead of computations, the substrates are frameworks for descriptions of things, which are computations of a special type.

In the meantime, it is useful to think of substrates in a well-known way: function computation. From this perspective, a substrate:

  • provides a “space” of computations (function implementations), which I’ll call “expressions” to use familiar terminology.
  • (implicitly) defines what functions can be computed. Some mechanism that is part of the substrate definition “evaluates” an expression to produce an output value. This is kind of what we mean whenever we talk about functions. Note that for Turing-equivalent substrates, there is a sense in which this is not meaningful (we can’t know exhaustively which functions are computable; the evaluation of some expressions may never halt).
  • Interesting biases and metrics, like:
    • a preference-ordering based on description length (related to the operations required to produce it from a “null” expression), commonly justified as a reflection of Occam’s Razor, or on information-theoretic grounds.
    • a preference-ordering based on resources required to evaluate the expression
    • given the expression editing operations, a distance measurement between expressions based on editing distance.
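To make the list above a little more concrete, here is a minimal Python sketch. It assumes a toy substrate whose expressions are arithmetic strings in a single variable x; the function names (`evaluate`, `description_length`, `edit_distance`) are my own illustrative choices, not part of any established framework.

```python
def evaluate(expression: str, x: float) -> float:
    """The substrate's evaluation mechanism: expression -> output value."""
    # eval is tolerable for a toy; builtins are disabled on purpose.
    return eval(expression, {"__builtins__": {}}, {"x": x})


def description_length(expression: str) -> int:
    """A crude Occam-style bias: shorter expressions are preferred."""
    return len(expression.replace(" ", ""))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: single-character insert, delete, substitute."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]
```

Here the editing operations are single-character changes, so "2*x+1" and "3*x+1" are one edit apart even though they compute quite different functions. That mismatch between edit distance and behavioral distance is part of what makes some substrates hard to search.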

Now, consider the case where some external entity can measure the “quality” of an expression based on how closely the function it computes matches up to some “desired” function. In the context of modelling, the idea will be that the expression should match up to the part of the world being modelled along some desired dimensions. But we can also think of this as a “supervised learning” situation.

Given the ability to measure the quality of an expression as described above, we are interested in finding the best of all expressions with respect to the evaluation mechanism (or maybe we might be satisfied to find “good” expressions).

If you are not already familiar with these ideas, that might end up sounding like gibberish so I’ll make up a quick example to illustrate. Here’s a graph of 8 arbitrarily-chosen points. Suppose we would like to find the line that is the “best approximation” of these points (a common sort of statistics task). One such line is drawn for illustration amongst the points. To fit this into our framework, we need to define the substrate, that is, the space of possible expressions. In this case, the expressions are equations of lines: y = Ax + B, where we can vary A and B to produce different expressions.

We also need to define the measurement (that is, define “best approximation”). So let’s try to minimize the mean squared distance between the line and the points. This is a nice example for visualization because the variations allowed are just two continuous parameters, which lets us use those parameters as the axes of a graph, with the “height” being the mean squared error for that particular expression. Have a look at the resulting graph (I used Mathematica to develop this example, in case you’re curious).

This kind of thing is often called a “fitness landscape”, using the word “fitness” instead of “quality” because of the analogous relationship to evolution. As usually expressed in this context, the evolution of living creatures operates on a substrate whose expressions are the possible genotypes of an organism (actually it’s more complicated than that but it will do as an approximation), and the “fitness” is something like the number of successful children the organism produces.

The optimal expression in my simple example case is the lowest point on the graph, which tells us the best values for A and B and thus the equation for the best line.
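For readers who want to poke at such a landscape themselves, here is a small Python sketch. The eight points are invented (the original data is not reproduced here), the error is measured as vertical distance, and the grid ranges are arbitrary assumptions.

```python
# Eight made-up points; these are NOT the points from the original graph.
POINTS = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8),
          (4.0, 9.1), (5.0, 11.2), (6.0, 12.8), (7.0, 15.1)]


def mean_squared_error(a, b, points=POINTS):
    """Height of the landscape for the expression y = a*x + b (lower is better)."""
    return sum((a * x + b - y) ** 2 for x, y in points) / len(points)


def landscape(a_values, b_values, points=POINTS):
    """Error at every (a, b) cell of a sampling grid."""
    return {(a, b): mean_squared_error(a, b, points)
            for a in a_values for b in b_values}


def lowest_point(grid):
    """The sampled (a, b) pair with the smallest error."""
    return min(grid, key=grid.get)


a_values = [i / 10 for i in range(0, 41)]    # slopes 0.0 .. 4.0
b_values = [i / 10 for i in range(-20, 41)]  # intercepts -2.0 .. 4.0
grid = landscape(a_values, b_values)
best_a, best_b = lowest_point(grid)
```

The grid only finds the best sampled cell, not the true optimum, which is one small taste of why search matters.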

But: How do we find this optimum point on the fitness landscape?

In some simple cases (like this one, actually), it might be possible to use analytic means to construct this optimal expression using a bunch of extra domain knowledge (inverting the math of the fitness function itself). But in the general case all we have is the measurement, so we become interested in how to search through the space of possible expressions to find the best one.
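For the line-fitting case, the analytic route is the standard least-squares formula. A minimal sketch, assuming the error is vertical distance and using a function name (`fit_line`) of my own choosing:

```python
def fit_line(points):
    """Closed-form least-squares fit of y = a*x + b."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    a = sxy / sxx              # slope
    b = mean_y - a * mean_x    # intercept
    return a, b
```

No search is involved here, but only because we know the exact algebraic form of the fitness function and can solve for where its gradient vanishes.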

Now, this is kind of cool! To find good expressions on a substrate (effective creatures, good models of the world, efficient automobiles, great novels), all we have to do is search around amongst the possibilities.

Usually this is easier said than done, unfortunately. Most search spaces are HUGE.

  • First of all, dimensions of the expression space may have infinite options. Like in our simple example here, A and B could be real numbers and so have infinite possible values. And quite often the pure analytic optimum will be an irrational number so no non-analytic process can ever even find it. In practice, most substrates are discrete, though: for representational purposes we use floating point numbers so only a finite number of options exist. Also, often (as mentioned above) we only really care about “good” expressions instead of “perfectly optimal” ones. Still, there are a lot of different possible floating point numbers to search through.
  • Most actually-interesting substrates have very high dimensionality. If we are searching through the space of all English texts looking for Shakespeare-quality plays, just think of all the dimensions involved!

So, the obvious and simplest search techniques — picking expressions at random or enumerating them one by one in some straightforward way — may take a long time.

If the fitness landscape is nice and simple, like the above example, hill climbing is an effective way of searching. In general, though, fitness landscapes are not so friendly — they have false peaks, ridges, discontinuities, and features not so concisely described with facile geographic analogies. The general subject of search has been tied up with AI from the very beginning. Some people believe that the two things are basically just two ways of describing the same thing: for example, the AIXI formulation of theoretical AI essentially takes this viewpoint.
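Here is one simple way hill climbing could look in Python. This is a sketch of a random-neighbour variant with a shrinking step size; the parameter names and default values are assumptions chosen for illustration, and many other variants exist.

```python
import random


def hill_climb(error, start, step=0.5, shrink=0.5, min_step=1e-6,
               max_rounds=10_000, rng=None):
    """Greedy local search: accept a nearby random move only if error drops."""
    rng = rng or random.Random(0)
    current = list(start)
    current_error = error(*current)
    for _ in range(max_rounds):
        if step <= min_step:
            break
        improved = False
        for _ in range(20):  # try a handful of neighbours at this scale
            candidate = [v + rng.uniform(-step, step) for v in current]
            candidate_error = error(*candidate)
            if candidate_error < current_error:
                current, current_error = candidate, candidate_error
                improved = True
        if not improved:
            step *= shrink  # nothing better nearby, so look more closely
    return current, current_error
```

On a smooth bowl this walks steadily downhill to the optimum. On a landscape with a false peak (here, a false valley, since we are minimizing error) it settles into whichever basin it starts in, because it never accepts a move that makes things worse.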

Getting back to Koza’s Genetic Programming, fitness landscapes for typical programming languages on typical programming problems are very ugly (by which I mean very uncorrelated in regions of interest). Programs are brittle — small changes made to high-fitness programs nearly always lead to disaster. The miracle is that it works at all!

In fact, it usually doesn’t work. Simulated evolution has not replaced or even significantly supplemented traditional software engineering methods.

Why does it work sometimes? That is a complicated question. The nature of the programming task, the details of the programming language, and the operators for editing programs figure heavily into the issue.

Let’s invent some shorthand and say that a substrate is evolvable with respect to a particular task if its fitness landscape has the desirable features: high local correlation, good continuity, and a relatively small number of unacceptable-quality false peaks. Evolvability is highly desirable if it’s important to be able to search for expressions. I care about this because I think it is very likely that search for good expressions is extremely important for intelligent systems (including systems engaged in automatic modelling), because empirical evidence strongly implies that complex problems resist analytical solutions.
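One rough way to put a number on “high local correlation” is to compare the fitness of random expressions with the fitness of their one-edit neighbours. The sketch below does this for two invented landscapes over 16-bit strings; everything in it (names, landscapes, sample sizes) is an assumption for illustration, not an established evolvability measure.

```python
import random


def local_correlation(fitness, sample, mutate, trials=2000, seed=0):
    """Pearson correlation between fitness(x) and fitness(mutate(x))."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(trials):
        x = sample(rng)
        pairs.append((fitness(x), fitness(mutate(x, rng))))
    mean_a = sum(a for a, _ in pairs) / trials
    mean_b = sum(b for _, b in pairs) / trials
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / (var_a * var_b) ** 0.5


def sample_bits(rng):
    """A random 16-bit genotype."""
    return tuple(rng.randint(0, 1) for _ in range(16))


def flip_one_bit(bits, rng):
    """The editing operator: flip a single randomly chosen bit."""
    i = rng.randrange(len(bits))
    return bits[:i] + (1 - bits[i],) + bits[i + 1:]


def smooth_fitness(bits):
    """Each bit contributes independently, so neighbours score alike."""
    return sum(bits)


def rugged_fitness(bits):
    """An unrelated pseudo-random value per genotype: no local structure."""
    return random.Random(hash(bits)).random()
```

Calling `local_correlation(smooth_fitness, sample_bits, flip_one_bit)` should give a value close to 1, while the rugged landscape should give a value near 0. Brittle program spaces tend to sit uncomfortably close to the second case.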

Because of this, we will be well served by designing evolvable substrates — evolvable programming languages, evolvable network models, evolvable everything. Not that we actually want to build complex computer modelling systems by aping evolution, of course, but because more targeted and appropriate search methods will be helpful.

And that’s all I have to say.

Oh, wait, no it isn’t: living things are evolvable (duh, otherwise they wouldn’t have evolved). The biological substrate called “life” is exceptionally interesting. I believe that a series of reductionistically-impenetrable transformations in level are crucial to this process. By “reductionistically-impenetrable” I mean that the “higher level” is not simply describable in terms of the “lower level” entities, nor are the behavioral and structural characteristics of the “higher level” tractably predictable from the “lower level”. Consider:

  • Protein shape and function derive from genetic sequences, but are not readily reducible to those sequences.
  • Cellular structures and functions are constructed largely from proteins, but cell biology is not really reducible to protein shape.
  • More such emergent levels exist — organs not easily reducing to descriptions in terms of cells, developmental pathways not reducing to patterns of gene expression, etc.

I believe that the redescribability afforded by these impenetrable layers helps afford the opportunity for radical restructuring of fitness landscapes and thus the potential for evolvable organisms to arise from substrate modification operators that work on the genetic level.

Perhaps it will be fruitful to design evolvable modelling substrates that share this kind of level-decoupling in efforts to enhance the feasibility of search. Someday we’ll know!

Looking back on it, my 1992 excitement about reducing the mysteries of intelligence to a few key insights from nonlinear dynamics was naive. For one thing, those constellations of ideas — complexity, evolution, self-organized criticality, chaos, etc — turn out not to be very sharp scalpels for doing reductionist dissection. For another, intelligence itself appears to be a tougher problem than my youthful enthusiasm suspected.

No matter. Figuring out how to reflect itself is one of the advanced ongoing features of the universe; I’m satisfied to be a little piece trying to pitch in, using whatever means make sense to me in the moment.

Oh, one more thing: Why is search so necessary to this enterprise? Wouldn’t it be much better to just construct a description instead of rooting around for one? Unfortunately, there appears (to me, today) to be no way to do such construction across the divide of “unconnected” levels of formalism. However, it is conceivable that there might be a “bottom-up” alternative, which perhaps I’ll write about in a future post.

Tags Categories: Uncategorized Posted By: Derek
Last Edit: 08 Jul 2009 @ 01 29 AM


 27 Jun 2009 @ 2:15 PM 

Critics sometimes see the AI enterprise as merely the application of the latest tech fad — given a shiny new toy like a computer, we just try clumsily to make a metaphorical connection: mind is like a computer.

That is of course completely backwards: actually, computers are like the mind — and purposefully so. Arithmetic, algebra, logic, and algorithms of various sorts were dreamed up long before computers came along to automate them. Our ancestors invented them hundreds and thousands of years ago to provide quantitative models of bits of the world they cared about, usefully predicting and explaining things.

Since then, we’ve found other uses for computers — as media for communicating and presenting information, for example — but modelling the world and modelling mental processes have always been a major motivation behind the technology.

The sheer number-crunching capacity of computers makes it feasible to model the world with methods not directly derived from introspection — physics simulations and whatnot. And some methods start out with introspective inspiration but quickly get sharpened to a narrow point to such an extent that they no longer bear a close resemblance to their conceptual and linguistic origins.

Consider chess as an example. Starting from the introspective view that a person might think: if I do this then he’ll do that and then I can counter with such-and-so… This one little observation about a small part of the mind’s modelling of the game led to interesting techniques for tree search, and then got taken to an extreme… in this case culminating in Deep Blue and the defeat of Garry Kasparov.
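The “if I do this then he’ll do that” idea can be written down directly as minimax search. The sketch below uses a toy stick-taking game rather than chess, and it is plain minimax, not a description of Deep Blue’s actual algorithms; the game rules and function names are assumptions made for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def minimax(sticks: int, my_turn: bool) -> int:
    """Return +1 if "I" win with best play from this position, else -1.

    Toy game: players alternately take 1 to 3 sticks, and whoever takes
    the last stick wins.
    """
    if sticks == 0:
        # The previous mover took the last stick and therefore won.
        return -1 if my_turn else +1
    outcomes = [minimax(sticks - take, not my_turn)
                for take in (1, 2, 3) if take <= sticks]
    # I pick the move that is best for me; the opponent picks the worst for me.
    return max(outcomes) if my_turn else min(outcomes)


def best_move(sticks: int) -> int:
    """How many sticks to take to reach the best outcome for me."""
    moves = [take for take in (1, 2, 3) if take <= sticks]
    return max(moves, key=lambda take: minimax(sticks - take, False))
```

For a game this small the whole tree can be explored. Chess programs cannot do that, which is where the narrowing and sharpening (pruning, evaluation functions, special hardware) comes in.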

When this narrowing and optimization of mind-inspired methods occurs, we typically say it is no longer AI, which actually seems appropriate to me.

Whether the source of computer models is introspection, physics, statistics, or ad-hoc empirical observation, modelling is the pervasive basis for much of the usefulness of computers. Consider a word processing program: much of its value derives from models of paper documents and other old structured communication forms, models of language, models of printing machinery (printer driver interfaces), and so on.

Years ago I did some work for a friend of mine who owned a business that developed and sold a complex software package to help manage fleets of concrete trucks (the big ones with the spinning drums on them). Such a package is full of models, from models of general business activities like invoicing and accounting to models of delivering megatons of concrete on a customer’s desired schedule — which involves geography and roads, time required to load and unload the trucks, etc. These models make forecasting and resource allocation much more accurate.

These kinds of things are the meat and potatoes of computer applications. Web browsing, playing games, and presenting media like music and movies are probably bigger uses for most people these days, but when you scratch the surface you’ll frequently find modelling underneath: HTML is a document model, and game graphics are probably the clearest example of computer modelling you’ll ever see.

The first “real” computer, ENIAC (1946), was built to calculate artillery firing tables (a pure modelling task). Today’s most powerful computer (the IBM Roadrunner, over a million gigaflops) is used primarily to simulate how nuclear materials age, modelling the safety and effectiveness of nuclear weapons (not the happiest purpose, but better than testing by exploding them).

All told, computer models comprise a hugely important industrial technology. And building them is hard: a labor-intensive, specialized, and hugely expensive process. The universe reflecting itself is not a trivial accomplishment! There are clear economic benefits from specific model construction, from the invention of new modelling techniques, and from improvements to the modelling process. Society can be served, and money made, by digging deeper into the roots of how it all works.

Our minds are really amazing modelling engines. So in a practical sense it isn’t far off to say that Artificial Intelligence == Automatic Modelling. That’s why — besides the interesting philosophical investigation of our own nature — many of us think AI is such a fascinating task. Not because we think “Wow, brains are like computers!” but because modelling modelling may be the key to the future.

I’ll end this post by listing a few computer modelling methods:

  • Differential equations and approximations thereof
  • Finite element analysis
  • Cellular automata
  • Function fitting
  • Fourier transforms
  • Neural networks
  • Simulated annealing
  • Genetic algorithms
  • Hidden Markov models
  • Bayes networks
  • Finite state machines
  • Computational algebra
  • Ad-hoc programmatic models
  • Object-oriented decomposition
  • Monte Carlo methods
  • Production systems
  • Logic of many flavors
  • L-systems and other generative models
  • Constructive solid geometry, NURBS, etc
  • Case-based reasoning
  • Control theory
Tags Categories: Uncategorized Posted By: Derek
Last Edit: 27 Jun 2009 @ 02 39 PM


 20 Jun 2009 @ 1:15 PM 
 

PSSH

 

If, as is widely reported, AI researchers (those scurvy dogs) began barking up the wrong tree a long time ago, I wonder when exactly that was and what scent led them astray. They started the chase promisingly… I think a lot of the foundational work of the Old Masters was really quite brilliant.

Case in point: The Physical Symbol System Hypothesis.

If you’re inclined to speculate that a computer could in principle be “intelligent” (I am so inclined; if that turns out to be wrong, then, well), it would be helpful to say some more about what is meant by a “computer”. If the speculation is phrased as “the UNIVAC I on the third floor could be intelligent,” that seems overly specific. So the natural thing would be to express it using one of the (equivalent) abstractions for “computation”, such as the Turing Machine, and just say that intelligence is computable and leave it at that.

In 1976, Allen Newell and Herb Simon (Newell’s doctoral advisor) chose a different way of putting it, a more detailed hypothesis that emphasized what they considered to be some actual fundamental issues of cognition. They called their abstract computer a physical symbol system. So:

The PSSH: A physical symbol system has the necessary and sufficient means for general intelligent action.

To simplify a bit, I’ll take “physical” as given and (as a nuts’n’bolts engineering-type guy) I’ll leave off the “necessary” part. Thus: a symbol system has the sufficient means for general intelligent action. So what’s a symbol system? Let’s start with some of their definitions:

  • A symbol is a physical pattern that can occur as components of another type of entity called a symbol structure.
  • A symbol structure is composed of a number of instances (or tokens) of symbols related in some physical way.
  • In addition to symbol structures, a symbol system has a collection of processes that operate on symbol structures to produce other symbol structures. These processes account for the creation, modification, reproduction, and destruction of symbol structures.
  • Through the application of those processes, then, a symbol system is a machine that produces through time an evolving collection of symbol structures.

Although that seems straightforward enough, people view the world through their own set of biases, which produce a number of (in my opinion) erroneous views of the PSSH and, by extension, rather odd characterizations of the AI enterprise. For example:

Viewpoint 1: 0 and 1 are symbols and machine instructions are processes that manipulate structures of bits, so it sort of follows that the PSSH is really only saying the same thing as “computers could be intelligent.”

Although this is a comforting way to respond to critics who attack the AI enterprise by enumerating the inadequacies of particular symbol systems, it ignores one of the central concepts of the PSSH: designation (which is not part of the basic definitions given above but is nonetheless part of the PSSH). According to Newell, a symbol structure designates a thing if the system can use the symbol structure to affect the thing itself or behave in ways dependent on the thing. That is, some sort of “access” to the object is provided by the symbol structure, and this relationship between symbols and the universe is critical, although it is of course limited (the map is not the territory, after all…). This gives rise to the usual way of thinking about symbols as “standing for” things.

Viewpoint 2: Symbols are atomic tokens such as hotdog. Although much work in AI has proceeded from this kind of assumption, it leads to significant difficulties and conundrums. Specifically, if hotdog is just a label floating around, where does the designation to an actual hotdog come from? This is one trivial reflection of the famous symbol grounding problem, which is a very subtle but crucial challenge to the “Viewpoint 2” way of looking at the PSSH. In reaction to this, some folks start thinking of some elements of an AI system (for starters, data related directly to sensor readings such as a camera image) as being subsymbolic, a suggestive word implying that symbols are constructed out of nonsymbolic entities. But there is no room for semantically loaded “nonsymbolic entities” in symbol systems. Furthermore, ugly issues arise from this “duality”: what makes some tokens symbols and some subsymbols? Where does the symbolness come from? Perhaps (the thinking goes) the very concept of symbol as used in the PSSH is wrongheaded if it leads to this kind of mystical dualism and complication.

Viewpoint 2b: Going even further, if symbols are clean labeled tokens, it is convenient for many purposes to provide semantics by assigning truth values or, more generally, probability values to symbol structures, and then choosing the processes of the symbol system to be truth-preserving transformations. This leads to symbolic logic and it dates back at least as far as Aristotle. Its productivity has made logic fundamental to many AI approaches, though sometimes you have to dig a bit to reveal how a particular system relies on this foundational way of carving up the world.

Note, though, that there is no inherent reason that symbols have to be simple. One more time: there is no inherent reason that symbols have to be simple. One more time: … oh, never mind.

Newell viewed the designation issue through the idea of a “representation law” — suppose that X is some sort of situation in the world, and T is some sort of transformation. For example, X = me holding a hotdog, and T = me eating the hotdog. The representation law says:

decode[encode(T)(encode(X))] = T(X).

Here, encode is a translation from the “external” situation to an “internal” representation, and decode is a translation in the opposite direction. So the law basically says that there is an internal process representing ‘eating a hotdog’ (encode(T)), and an internal symbol structure representing the situation of me holding a hotdog (encode(X)). Applying the process to the structure should produce a symbol structure that corresponds to the external situation after the hotdog is eaten. For example, both in the real world and in the internal representation:

  • I am no longer holding the hotdog
  • I desire an alka-seltzer tablet
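The representation law can be checked mechanically in a toy setting. In the Python sketch below, the “external” situation is a dict of facts and the “internal” representation is a set of symbol tokens; all of the names and both encodings are my own assumptions, chosen only to make the law executable.

```python
# External "world": a situation is a dict of facts (a stand-in for reality).
def eat_hotdog(situation):
    """T: the external transformation."""
    after = dict(situation)
    after["holding_hotdog"] = False
    after["wants_alka_seltzer"] = True
    return after


# Internal representation: a frozenset of the symbol tokens that hold.
def encode(situation):
    return frozenset(name for name, holds in situation.items() if holds)


def decode(symbols, vocabulary):
    return {name: (name in symbols) for name in vocabulary}


def encoded_eat_hotdog(symbols):
    """encode(T): the internal process standing for 'eating the hotdog'."""
    return (symbols - {"holding_hotdog"}) | {"wants_alka_seltzer"}


VOCABULARY = ("holding_hotdog", "wants_alka_seltzer")
X = {"holding_hotdog": True, "wants_alka_seltzer": False}

# The representation law: decode[encode(T)(encode(X))] == T(X)
law_holds = decode(encoded_eat_hotdog(encode(X)), VOCABULARY) == eat_hotdog(X)
```

Notice how much work the hand-written `encode` and `decode` quietly do here: they assume the world already arrives carved into named facts.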

This corresponds exactly to the intuitive sense of what I expect from a model, so perhaps the representation law could be the foundation for a theory of modelling. Unfortunately, though, it is kind of frustrating because it sidesteps all of the important practical issues by defining them away. It implies that the causal relationship between the hotdog and the representation of the hotdog is unimportant — as long as the representation law holds, it doesn’t matter where the representation came from. So basically symbols are defined as the things that do what symbols should do. This facile embedding of designation in compartmentalized encode and decode procedures trivializes the process — for evidence, notice that AI researchers talk a LOT about T and X and the symbol structures themselves, but rarely about encode or decode. Still, having this representation law as a starting point is better than nothing!

Of course, these early steps (however profound and crucial), as embodied in their shortcut Viewpoint 2 form, are not The Answer; mistaking this path for the destination led to an overly-optimistic implicit view of denotation as primarily residing in the relationship between things and indivisible symbol tokens. The result: brittle structures too leaky to hold much meaning.

But what of it? Taking that torch and marching with it down the road is our job, and let’s get to it. My little trail: designation follows naturally from the principle that things themselves exist only as their descriptions, so the invention of modelling substrates and simplicity-biased induction methods on those substrates is the natural direction to take. Easy to say! But if we wean ourselves off of an exclusive reliance on truth-preservation as the justification for model transformations, what can replace it? Reductively deducible laws (e.g. F = ma), certainly, but those are rare. Can simplicity + empirical correctness fill the gap? I don’t know.

In 1992, just five years after the William James Lectures which were later published as the magnum opus Unified Theories of Cognition, Allen Newell passed away at age 65, one of a hundred billion tragic triumphs for the great enemy. SOAR is the ambitious and still-active software system embodying the ideas of Newell and his many academic descendants. Check it out, it’s cool.

Tags Categories: Uncategorized Posted By: Derek
Last Edit: 20 Jun 2009 @ 02 55 PM


 11 Jun 2009 @ 1:26 PM 

The other day I was walking around the house and, as happens to me occasionally, I noticed that I’m really good at it. You’re really good at it too… try it! Walk around a little bit! The smoothness of movement, economy of action, flexibility of adapting to circumstance, variety of execution, exquisiteness of balance — you’re like a ballet dancer. Stunning!

It’s a seemingly straightforward physical task, and almost all of us (including our pets) easily and completely blow away the finest and most high tech engineering on the planet. That’s always been very attention-grabbing for me. If we could even get close to making devices which move as flexibly, robustly, and accurately as we do, an industry that could rival that of the automobile would open up.

In the summer of 2000 I decided to experiment with a simple legged robot named Bing. I knew absolutely nothing about electronics or mechanics. I got the legs mostly constructed, but did not finish the robot. The geometry was awkward, the linkage mechanism was stupid, the motors were not strong enough, and the joints were poorly designed and wobbly. I had great fun, though, and learned some things. Read about it here if you want.

Years later, in 2004, I discovered the Japanese Robo-One competition and that rekindled my interest in robots that walk. Following the generally-used design ideas for Robo-One competitors, I designed and built a little humanoid using a lot of hobby servos. Again I learned a lot about electronics and mechanical things. I got Bing 2 to take a few steps and perform a few other maneuvers. Here’s a couple of very short videos:

Unfortunately, there is a huge problem with this type of robot — it has no sensor feedback, so the only programming you can do is play back canned sequences of movements and hope they work properly. The severe impact of this limitation became clear to me as I experimented with the device, and I put it aside.

Another two years went by and I decided to have a go at a third generation. Bing 3 was designed to have a lot of sensor feedback — pressure-sensitive pads on its feet, position and electrical power feedback from each joint, and a “guidance system” based on a set of accelerometers and gyroscopes. In addition, I built a computer-controlled milling machine to help me build the parts. Once again, I had great fun constructing it, learning a lot in the process — my skills with mechanical and electrical stuff are now at the “not bad” level. I documented the project in a very detailed thread on a hobby robot forum.

I did some interesting programming on Bing 3, including sensor conditioning / integration, and a 3-D geometric representation of the robot’s body that let me compute things like the theoretical center of mass and whatnot. The model included inverse kinematics to help control the limbs.
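To give a flavor of this kind of geometric model, here is a minimal Python sketch of inverse kinematics for a planar two-link limb, plus a center-of-mass calculation. This is a textbook simplification, not the code that ran on Bing 3; the function names and the two-link assumption are mine.

```python
import math


def forward(theta1, theta2, l1, l2):
    """Foot position of a planar two-link leg from its joint angles (radians)."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


def inverse(x, y, l1, l2):
    """Joint angles that place the foot at (x, y); one of the two solutions."""
    cos_t2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if not -1.0 <= cos_t2 <= 1.0:
        raise ValueError("target is out of reach")
    theta2 = math.acos(cos_t2)
    theta1 = math.atan2(y, x) - math.atan2(l2 * math.sin(theta2),
                                           l1 + l2 * math.cos(theta2))
    return theta1, theta2


def center_of_mass(parts):
    """Mass-weighted average position of (mass, (x, y, z)) body parts."""
    total = sum(m for m, _ in parts)
    return tuple(sum(m * p[i] for m, p in parts) / total for i in range(3))
```

A real limb with more joints generally has no tidy closed form like this, which is part of why the full problem gets hard so quickly.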

But two things caused me to stop working on Bing 3: The first and most devastating — which I should really have discovered as early as my first work in 2000 — is the terrible inappropriateness of “hobby servos” for complex control of articulated robots. For those who don’t know, the interface to servos like these is to give it a desired position, and then a controller on a computer chip inside the servo works as hard as it can to achieve that position as quickly as possible and maintain it at all costs. This is a very rigid and frustrating way to control a “simulated muscle”, and I was unable to achieve any sort of fluid, efficient, or flexible motion. This continual “clenching” also caused me to ruin a couple of motors.

The second problem was the sheer complexity of the task. Controlling an articulated robot with sensor feedback is very hard. Bing 3 has 19 motors and a total of 52 channels of sensor feedback. Figuring out a walking method for such a complex device starting from scratch proved to be too difficult for me.

Now it’s 2009 and I’m once again thinking about this task. Time to build more robots. Because of the complexity problem I’m not going to build another “humanoid” though — I want to try to come at the problem from a much simpler perspective, which means starting with just a few degrees of freedom, with a goal of masterful control of a simple robotic system. I will keep the sensor-rich approach from Bing 3, but get some motors that I can operate in “torque mode” — instead of telling the motor what position to go to, just tell it how hard to push.
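The difference between the two interfaces can be sketched in a few lines of Python. The model below assumes the servo’s internal controller is a saturated PD loop with fixed gains; real servo firmware is not documented here, and the gains, units, and function names are hypothetical.

```python
def clamp(value, limit):
    """Restrict value to the range [-limit, +limit]."""
    return max(-limit, min(limit, value))


def position_mode_torque(target, position, velocity,
                         kp=40.0, kd=1.0, max_torque=1.5):
    """Torque applied by a hobby-style servo's internal loop (assumed PD).

    The caller only chooses `target`; the stiff gains are baked into the servo.
    """
    return clamp(kp * (target - position) - kd * velocity, max_torque)


def torque_mode_torque(requested, max_torque=1.5):
    """In torque mode the caller chooses the effort directly."""
    return clamp(requested, max_torque)
```

With these made-up gains, a joint held just 0.1 radians away from its target already drives the position-mode servo to its torque limit, which is the “clenching” behavior. In torque mode the outer control program decides how hard to push, so soft or compliant behavior becomes possible.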

Additionally, since I am so interested in AI and modelling and learning these days, I want to start thinking about control systems somehow in a more adaptable and less ad-hoc way. Hopefully this change of perspective will be fruitful: instead of viewing the task as a walking problem, I will think about it as a modelling problem, which should also shed some light on the more general issues of how to build effective models.

Since Microsoft has chosen an unfortunate name for its new search engine, I need to pick a new robot name: Zing.

Stay tuned for gory-detailed build reports as the summer moves forward!

Tags Categories: Uncategorized Posted By: Derek
Last Edit: 11 Jun 2009 @ 01 26 PM


 06 Jun 2009 @ 3:40 PM 

I still have some rather obnoxious and academic preliminaries to slog through before I feel comfortable moving on to the specific technology of applied philosophical engineering. Even though this process is both abstruse and fumbling, it is necessary and I can only hope somebody somewhere thinks it’s as interesting as I do.

In an earlier post, I marveled at the thinginess of the universe. But what does it mean to say that things exist? I don’t mean to question the existence of the water molecules making up an excellent tubular ocean wave; I am wondering about the existence of the wave itself as a thing. The purpose of this essay is to answer that question from what I hope will eventually turn out to be a highly practical and useful perspective.

Let’s call the appearance of thinginess emergence.

Now that word (emergence, emergence, emergence!) has a rather poor reputation these days, especially among those with a reductionist world view — what good does it do to hide actual causal compositional mechanisms behind such a mystical and impenetrable label?

Emergence, because it’s slippery and interesting, is debated academically by philosophers, who view the concept from several different perspectives. Alas, in my view (as I have been explaining), most of the things in the universe (including concepts such as emergence itself) are fuzzy enough to defy such fine description, and so detailed philosophical analysis of this type boils down to ultimately meaningless distinctions between subjective viewpoints. Not to say that all non-mathematical analysis is worthless… but I think it usually can only go so deep before it diffuses into a cloud of imprecision and personal preference. Your mileage may vary, though, and if you’re interested, there are some relatively well thought-out and detailed modern philosophy papers on emergence. Here’s one.

Emergence could be said to be the core of the interdisciplinary science-ish field of Complex Systems because typically emergence is characterized as a difficult-to-predict result of nonlinear interactions between components. From that perspective, the most sexy and interesting sort of emergence involves the creation of “complexity” from simplicity — for example, auto-catalysis and evolution producing complex living creatures from boring chemical soup. John Holland has written an excellent book on this subject, and there are many other researchers justifiably fascinated by the complexity-generating aspect of emergence.

For me, the idea of emergence, beyond the peculiar fact of its ubiquity, is important because it is central to the way I look at minds and modelling. I don’t think this view is particularly uncommon, but I will try to explain my take on it as carefully as I can because many essays to follow will rely (usually implicitly) on this basic viewpoint:

The usual way of thinking about emergence revolves around the idea of irreducibility — for example, even though life is certainly in one sense nothing more than molecules banging together, it is impractical, impossible, or just silly to express the interesting concepts about living creatures in terms of bouncing chemicals, which makes one question the feasibility of reductionism as an ultimately fruitful methodology.

I personally am not so hung up on this point, though: “Prime number” is a perfectly valid bit of emergent structure even though it is completely and simply reducible to specific “lower level” elements. The important thing is the usefulness of the description (that is, the existence of a productive ontology), not so much the peculiar details of specific ontological properties.

Going back to some examples from previous essays, when an asteroid smashes into the moon, the crater that is formed is an emergent entity. When gravity causes interstellar glop to gather into a huge ball which gets so dense and hot that nuclear fusion commences and a star is born, that star is emergent. Democracy as a social structure is emergent from human nature and other aspects of our world.

Going further, all of the things in the universe — from bees to thunderstorms to love to death to money to the category we label “coffee cup” — are emergent.

It might seem that if (literally) everything is emergent, there is no point to the concept. But there is no emergent structure in uniform randomness (maximum entropy) nor in uniform invariance (minimum entropy). For something to fall between these extremes, it must be possible to capture the thinginess of a bit of emergent structure with a description. That is exactly the sense in which things exist: by virtue of their describability.
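One crude way to get a feel for the two extremes is to use a general-purpose compressor as a stand-in for "length of the shortest description." This is only a loose analogy and not a definition of emergence; the helper name `description_length` and the three sample byte strings are assumptions made for the sketch.

```python
import random
import zlib


def description_length(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data: a rough proxy for describability."""
    return len(zlib.compress(data, 9))


n = 10_000
rng = random.Random(0)  # fixed seed so the run is repeatable

noise = bytes(rng.getrandbits(8) for _ in range(n))  # pseudo-random bytes
constant = bytes(n)                                  # n zero bytes
pattern = bytes((i * i) % 251 for i in range(n))     # a simple repeating rule

for name, data in [("noise", noise), ("constant", constant), ("pattern", pattern)]:
    print(name, description_length(data))
```

The noise barely compresses at all, the constant run collapses to almost nothing, and the patterned data lands in between: it has a description that is short but not trivial.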

These descriptions are also things. That the universe can reflect itself in this way is glorious and astonishing. The fact of it is the basis for minds, and that’s why it matters.

We “see” the universe through our own descriptions — sensory descriptions, conceptual descriptions, linguistic descriptions, mathematical descriptions, logical descriptions. But the descriptions available to us through inbuilt capability and training are only a subset of the possible descriptions that could be applied to the universe. I wonder how much of the “real” emergent structure of the universe we can “see” — and how much we miss, hidden in plain sight by the limits of our descriptive ability.

I will call a description of an emergent thing a model. This leads me directly to the next steps in elaborating modelling:

  • What is the nature of the relationship between the “terms” of a description and the thing being described? (the problem of reference, or intention)
  • What types of descriptions (models) are usefully realizable in computer programs?
  • What types of descriptions are generated by brains? Does it make sense to call these conceptual models?
  • How do sensory modalities relate to intentionality and model structure?
  • What aspects of things can and should be described by a model?
  • What a priori and empirical features make for good models?

These and other questions will focus the ongoing project of understanding and applying modelling, and in so doing perhaps advance the miraculous history of the universe reflecting itself in the mirrors of mind.

Postscript:

Although I think of emergence in this way myself, it is probably not smart to broaden the definition of a perfectly good word like this for the purpose of conversation. Therefore, when talking to others, I use the more standard-style definition: “with no currently-apparent tractable reductionist theory.”

Tags Categories: Uncategorized Posted By: Derek
Last Edit: 04 Jul 2009 @ 05 20 PM

 30 May 2009 @ 12:27 AM 

Seems I never quite got back to sleep. It turns out that I don’t want to stop blogging, since I enjoy it. What I will do, though, is broaden the range and nature of my posts — rather than attempt to focus on a particular subject matter, this blog will become a more general and personal creative outlet. Many of my topics will be along the lines of things I’ve written about previously since I find them particularly interesting, but others (such as this one) will not. I hope that friends, family, acquaintances, similarly-minded strangers, and bored far-future archeology robots will find something worthwhile here. If not, oh well.

In modern casual Christian mythology, St. Peter tends the gates of heaven, interviewing prospective entrants.

“And what did you do with your life?” he would ask me.

“Well, I spent a lot of time watching progress bars on my computer screen.”

He would nod, having heard that quite a lot. “And?”

“Hmm. I wrote a lot of code.” And sure enough, that’s true. My 10,000 hours and more have been spent coding — the closest thing I have to a theme for my life.

When I was younger, I’d code like mad. I’d code all day. I’d code all night. It didn’t matter really what the program was for, the process of telling a whirling, blinking machine exactly what to do — in every excruciating detail — was completely absorbing to me. I flowed, outside time, undistractable, bodiless.

Eventually, I made a career of it — along with inventing ideas about which programs to write and organizing the efforts of teams of coders. Today I still spend a fair bit of time coding. But over time typing computer code became like driving a car: an almost unconscious reflex… and bit by bit my passion for it withered away. Coding for its own sake no longer consumes or fulfills me. It’s a thing I do well, but the passion is gone. Sometimes (not as often as I’d like) the reason for a particular coding task motivates me strongly, but the process itself is just the means to an end now, a trade skill.

I think it’s common for middle-aged folks like me to realize that passion has dissipated and to get upset about that. Recently I wondered whether I might be able to ignite a new passion around, say, making music, but I’m full of doubt over whether I have the willpower to put in 10,000 hours at this late time in my life, training ear and voice and fingers, or even whether I have enough inborn capability to make such an enterprise worthwhile.

Explaining all that time spent on refining tiny muscle movements to St. Peter also seems problematic.

The general question of how one should spend one’s precious hours does haunt me. There’s a thousand things I love to do, and none that I love to do enough. I do know that musing about things — a lazy dilettante hobby — gives me pleasure, and indulging this desire has never caused me to look backward with regret. So… muse I shall.

It might be that if enough people muse visibly on the internet in the great gooey blogosphere, like the chattering of a million subconscious thought fragments in a global mind, something like a gestalt understanding might emerge in a big evolving cloud of shadowy concepts. If that happens, will we even recognize it?

One thing I find particularly interesting is how difficult it is to communicate ideas about abstract and poorly-understood subjects such as philosophy and artificial intelligence. Reading conversations on mailing lists and discussion forums, I’m always amazed at how people that are earnestly and sincerely interested in these questions often talk completely past each other, as if we all speak our own private dialects sharing nothing more than a few suggestive vocabulary terms.

This great murmuring babble wafts outward into the ether; I wonder if angels and old souls living in heaven can make sense of it, and if this type of music is pleasing to the celestial ear. Perhaps someday I’ll know the answer.

(Note: I’m not actually a Christian, and my references to pop theology are purely metaphorical. Sometimes I think it would be nice to believe such things, but I cannot choose what I believe — and even if I could, that strikes me as a very dangerous skill to exercise.)
Tags Categories: Uncategorized Posted By: Derek
Last Edit: 30 May 2009 @ 04 28 AM

 10 Mar 2009 @ 3:47 PM 

The few people interested in this blog may have noticed that it has gone dormant.

After toying with the “AI problem” for a while it has become clear to me that nobody (including me) seems to have a very promising research agenda for making progress on building artificial minds.  It pains me to admit this… I had always thought that right around now would be the time when we’d start making some truly interesting progress on the fundamental issues underlying cognition.  But I don’t see any evidence at all that this is the case.  Assuming I have a more or less average lifespan, the clock is ticking.  I had really hoped to see this problem solved before my time is over… but I can’t point to a single insight made in the last decade that counts to me as actual progress toward general-purpose thinking machines.

The yearly AGI conference was this past weekend and I’m sure that the attendees would disagree with me, many of them thinking that they personally are on the right track and making tangible progress, but unfortunately I just don’t see it.

None of that would bother me too much if I had a clear path of my own to explore, and if I were enjoying thinking about it, but alas that seems not to be the case.  I really do think that there is some potential for the idea of studying the “model-building” process and then wrapping that around on itself to model model-building itself.  That’s very vague, though, and I have not come up with much by way of connecting that to bits of computer code or other concrete applications.

So I’m going back to sleep for another five or ten years, unless some unexpected insight comes to me in a dream or I see somebody else come up with something promising.  In the meantime, I have always wistfully wanted to make music, despite having no training or demonstrated talent.  At some point, age-related degeneration of my joints and vocal cords will make that impractical, so if I’m ever going to have a good try at expressing myself musically, I had better do it now.  I have started a little bit and am having great fun.

Good luck to those of you who are more dedicated to this cause than I.  I’ll check in to see what you’ve been up to and have another shot at it myself around 2014 or so.  I sincerely hope that my grim assessment of the state of the art is wrong.

Tags Categories: Uncategorized Posted By: Derek
Last Edit: 10 Mar 2009 @ 03 47 PM

 10 Dec 2008 @ 5:41 AM 
 

Floats

 

Suppose that we take a bunch of bits and break it up into two different pieces (A and B), then interpret the pair as A × 2^B, where A is a fraction between, say, -1 and 1, and B is an integer. This rather complicated way of interpreting a bunch of bits is a powerful idea, yielding numbers with good precision and a wide range of possible values. These are called floating point numbers and are really the workhorses of modern processor chips.
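The A × 2^B split can be seen directly with Python's standard library. This is a small sketch, with `to_float32` and `split_float` as helper names invented for the example. Note that `math.frexp` returns a fraction whose magnitude is between 0.5 and 1, which matches the description above, whereas the bits stored by IEEE 754 hardware use a slightly different normalization (a sign bit, a biased exponent, and a fraction with an implied leading 1).

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def split_float(x: float):
    """Return (A, B) with x == A * 2**B and 0.5 <= |A| < 1 for nonzero x."""
    return math.frexp(x)


value = to_float32(6.5)
a, b = split_float(value)
print(a, b)        # 0.8125 3
print(a * 2 ** b)  # 6.5
```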

This is because “real numbers” (points from a continuous line of values) are incredibly useful for modelling things, and floating point numbers are for many purposes excellent approximations to real numbers. Suppose that we treat a 32-bit floating point number as a distance measured in inches. The smallest size that number can represent is way smaller than the size of a subatomic particle and the largest size is way bigger than the entire universe. And there’s a pretty decent amount of precision available too (call it 24 bits).
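The inches claim can be checked with a few lines. In this sketch the helper name `float32_from_bits` is made up for the example, and the physical sizes in the comments are rough order-of-magnitude figures rather than precise measurements.

```python
import struct


def float32_from_bits(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


INCH_IN_METERS = 0.0254

smallest_normal = float32_from_bits(0x00800000)  # 2**-126, about 1.18e-38
largest_finite = float32_from_bits(0x7F7FFFFF)   # just under 2**128, about 3.4e38

# A proton is on the order of 1e-15 m across (rough figure).
print(smallest_normal * INCH_IN_METERS)
# The observable universe is roughly 8.8e26 m across (rough figure).
print(largest_finite * INCH_IN_METERS)
```

The smallest normal value works out to roughly 3e-40 m and the largest to roughly 8.6e36 m, so both ends leave many orders of magnitude to spare.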

There are two most-common sizes for floating point numbers — 32 bits (called single precision) and 64 bits (called double precision).  The main reason for needing double precision is that 24 bits of precision is sometimes not adequate; it only lets us distinguish between about 16 million values.  That’s plenty for a lot of computation but there are two main cases where it isn’t enough:  those cases where higher precision is simply required, and cases where we have to perform arithmetic on numbers with very different scales.  As an example of the latter, suppose that we want to add 0.001 to a 32-bit floating point value 100000.0.  You’d think that the answer would be 100000.001, but it actually doesn’t work!  With 24 bits of precision, that value cannot be represented, so the answer is just 100000.0.  That is the reason that the single precision version of the Mandelbrot Set program I wrote about in another post cannot zoom in very far.
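The 100000.0 + 0.001 example can be reproduced without special hardware by rounding through a 32-bit encoding. This is a sketch using only the standard library; `to_float32` is a helper name invented for the example, and rounding a double-precision sum down to single precision stands in for true single-precision arithmetic (for these particular values the result is the same).

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


big = to_float32(100000.0)

total32 = to_float32(big + 0.001)  # single precision: the small term is lost
total64 = 100000.0 + 0.001         # double precision: the small term survives

print(total32 == big)       # True
print(total64 == 100000.0)  # False

# Near 100000 (between 2**16 and 2**17) a 24-bit significand gives a spacing
# of 2**(16 - 23) between neighbouring single-precision values.
spacing = 2.0 ** (16 - 23)
print(spacing)              # 0.0078125
```

Anything smaller than half that spacing, such as 0.001, rounds away to nothing when added to 100000.0 in single precision.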

Besides distances, floats are great at representing other physical quantities (temperature, mass, etc). They are also fine for representing probabilities (although a fixed-point representation is usually better unless the need arises for truly tiny probabilities).

Just as we saw with integers, a huge amount of mathematical structure has been built up around real numbers, and approximating real numbers with a floating point representation gives access for modelling purposes to that big pile of math. To help this along, modern processors include primitive operations for addition, multiplication, and division — and also fancier things like square root, and so on. Large portions of the silicon used to build CPUs are dedicated to fast circuitry for doing these things… if one is interested in keeping a big fraction of the transistors of a CPU busy doing work, using vectors of floating point numbers (either as vectorized data types or data parallel stream processing) is surely the way to go.

That might be a frustrating conclusion if you’re interested in AGI… most people’s approaches to thinking about mind are not naturally expressed as vectorized floating point arithmetic. However, as a root level building block of useful modelling abstractions, floats are often excellent (which is why they exist!).

Readers might be wondering why I am plodding along on these really basic considerations, but we have to start somewhere when thinking about using computers for modelling. Our basic fundamental building blocks are bits, integers, and floating point numbers (plus a few other things like null-terminated strings which have a few specialized opcodes). What can we build from these, and what are our motivations for the next layer of complexity on top of raw bits?

Tags Categories: Data Types Posted By: Derek
Last Edit: 10 Dec 2008 @ 01 37 PM

 26 Nov 2008 @ 2:31 AM 

Along with the change in focus, I’m making a couple of changes to this blog itself, including a new visual theme.  I will be adding more content outside the main posts… along the bottom of the browser you’ll find links to some other pages, most of which are barely more than placeholders so far.

The “About” page will give an overview of the stuff I’m writing about.

The “Links” page will include links to other blogs and internet sites related to modelling that I find interesting.

Rather than just write about stuff, I really want to get to work on some hobby projects, so over the last couple of weeks I have jotted down some project ideas that I can pick from when I want to start a new project.  These projects will get their own pages as I start working on them.

I’m sure I’ll think of some other pages as well as time goes on.

And finally, I have a new domain name:  supermodelling.net.

Tags Categories: Blog Posted By: Derek
Last Edit: 07 Dec 2008 @ 09 42 PM

 26 Nov 2008 @ 2:28 AM 

It seems to me fruitful to think of intelligence as having three deeply-interrelated components: modelling, language, and “weird stuff” — where “weird stuff” is consciousness and free will and so on: the things that writers put in novels and movies to make interesting plots, and what gets discussed on mailing lists and frightens those with weak ties to reality. Much of that anthropomorphic material makes it seem like AI researchers are writing programs to create golems or voodoo dolls.

I am still fascinated by all aspects of AGI, and probably always will be, but since I want to focus on things that are not quite so broad as “AGI”, this is a good place to start chopping. Besides, I’m not really all that interested in the weird stuff any more.

I am interested in language, but not enough to focus on it.

That leaves modelling. What do I mean by that word?

Modelling is the representation of things, and methods for manipulating those representations which, when interpreted, can be put to good use. Probably the biggest such use is prediction. If the model is accurate, we can extrapolate its parameters over time to predict what the modelled thing will do in the future. We can predict the effect of performing some operation on a thing by performing an analogue of that operation on a model of the thing.
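As a toy illustration of prediction by extrapolating a model's parameters, the sketch below fits a straight line to a handful of made-up observations and then asks the fitted line about a time that was never observed. The data, the function names, and the choice of a linear model are all assumptions for the example.

```python
def fit_line(times, values):
    """Least-squares fit of values ~ slope * t + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    var = sum((t - mean_t) ** 2 for t in times)
    slope = cov / var
    intercept = mean_v - slope * mean_t
    return slope, intercept


def predict(model, t):
    """Apply the model (slope, intercept) at time t."""
    slope, intercept = model
    return slope * t + intercept


# Hypothetical observations of some thing's position over time.
times = [0.0, 1.0, 2.0, 3.0]
positions = [1.0, 3.0, 5.0, 7.0]

model = fit_line(times, positions)
print(model)                # (2.0, 1.0)
print(predict(model, 5.0))  # 11.0
```

The two fitted numbers are the model's parameters. Predicting is just operating on the model instead of on the thing itself.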

Besides prediction, there are other sorts of reasoning we can do using models — we can make guesses about the origin of a thing by manipulating models of other things (such as components). We can also model abstract sorts of things like categories by which means we can figure out how to recognize them.

And there is much more to the story, I’m sure of it. I’d like to spend time thinking about the question “What is modelling?”. It seems so much more answerable and useful than asking “What is intelligence?”.

The most interesting question of all — one that I do not yet feel qualified to even begin addressing — is how to model modelling itself. This kind of situation where questions and methods wrap around on themselves leads directly to some of the “weird stuff”… and if ever I do work up the chutzpah to study the bizarre things, it will probably be by sneaking up on the idea of modelling modelling.

But not today. Today I am curious about all the ways we have of modelling things on computers. What are they good at? What are they bad at? Why? Do the methods support construction or adjustment of models automatically or must they be completely pre-specified? How efficient are the methods? Can things be modelled using multiple techniques? If so, what is the relationship between the models?

One interesting distinction seems to be whether something is modelled from the inside or the outside. Outside models are based and judged purely on external observations of the subject. A painting is an outside model (though most good artists aspire to be inside), and a neural network (or other statistical regression technique) is usually an outside model as well.

Inside models address the “why” behind the subject — a prediction using an inside model is based on interactions of sub-models of things purportedly making up the subject. It is interested in causal relationships.

Outside models do not require deep understanding, focus on surface features, and can often be easily learned. Inside models reflect fuller understanding, reflect a subject’s internal structure, are much more interesting, and are usually very difficult to acquire automatically.

I am tempted to say that animal brains only make outside models; humans have some ability to form inside models. But I haven’t thought about it that much, and it’s unclear that the conclusion is worth anything even if true.

So I’m going to learn more about modelling, and probably model a few things myself.

Tags Categories: Modelling Posted By: Derek
Last Edit: 05 Jun 2009 @ 12 09 AM
