Beginner’s guide to Robosen Optimus Prime

Once in a while I need to write about lighter topics, and the Optimus Prime robot created by robotics company Robosen is a peculiar enough case. Nearly all reviews of this robot are so wowed by its motorised transformations and animations that there is no room left for criticism, so I feel it’s up to me, as an owner of the robot, to provide a more complete picture.

When I saw an early prototype by Robosen, I dismissed it as just another Chinese bootleg, albeit a well-designed one. A few years later, however, Robosen was actually licensed by Hasbro to make the robot in the image of Optimus Prime from the Transformers franchise. Their 2021 “Flagship” version costs about $1000 and stands 50cm tall. Although I had been looking to buy a programmable humanoid robot for some time, both the price and the size exceeded my comfort zone. A year later a smaller “Elite” version was released, and once its price dropped to $700 in 2023, I decided I might as well check that box on my wish list. I’ve now had the robot for half a year, and these are my insights. Expect an engineer’s perspective, criticism, and undocumented features.


For children?
The high price tag immediately raises the question of who this is for. Robosen is a genuine robotics company, but the Transformers franchise is a toy line marketed to children and collectors. The smartphone app used to control the robot remotely is slick and child-friendly. It even makes an attempt at games with tutorial “missions” that have you replicate brief animations in stop-motion style, or by moving sliders to a given target angle for each joint. Completing a tutorial unlocks its sound but not the animation, which would have been more rewarding.

Setting up your own animations with stop-motion or block-based “programming” is very simple, in fact too simple for learning to program, as it does not feature basic “if A then B” components. While doing a few missions is a good way to get familiar with how to create your own animations, the later missions range from frustrating to impossible. On top of that, hearing Optimus complain “You’ve made a grave mistake.” at every slight deviation, and having to wait 10 seconds for the robot to realign itself, 30 times in a row, is not fun in the least. If you just want to unlock the sounds, you can also access them directly on the robot’s internal storage through a usb cable (in folder 21). Of course the voice-commandable actions are great for kids, as is simply steering Optimus in robot or truck form in all directions with the simple app controls. It is an awesome robot to see in action.

But when it comes down to it, the price is obviously too high for a children’s toy, the weight of 1.8 or 2.8kg too unwieldy, the electronics too fragile, and the motorised joints a pinching hazard for small fingers. These properties also make the robot unsuitable for elementary school STEM education, for which there are better options. Clearly the target audience must be adults with large budgets, like collectors or robotics enthusiasts, and that’s where interests start to conflict in peculiar ways.

For collectors?
The vast majority of buyers are Transformers action figure collectors, and the blocky 80’s cartoon-accurate robot is definitely designed for nostalgic appeal. Still, the robot’s price is ten times that of the typical “expensive” action figure, and some buyers find themselves forced to resell the robot at half price to pay more essential bills. As cool as the robot looks, this is not a product that you want to buy on impulse. The “limited edition” tag is no argument: Over 58,000 of the numbered Flagship version have been sold over the past 3 years, and over 100,000 if we count app downloads. To make the choice clear and plain: You should only consider buying one if you have ample money to spare, as the novelty will wear off, and the lifespan of complex electronics products such as these is unlikely to exceed 5 years, considering non-replaceable batteries, servo wear, and app support.

Many early buyer reviews are enthusiastically positive and describe the robot as “a dream come true”. But, while I was an avid Transformers fan as a kid, my own dreams featured intelligent interactive robots rather than plastic toys. As such, I personally find it strange that the main complaint of figure collectors (aside from the incomparable price tag) is that the robot can not be posed when turned off. When the servos are not running to maintain position, the robot becomes a heavy, limp ragdoll, for which the only somewhat stable poses are sitting with its back against a wall, or its vehicle form. While this is an inconvenience, there are also solutions (see hardware upgrades further down).

Another commonly voiced deal breaker regarding Robosen’s other robots is that they are built to different scales. Having owned various Optimus Prime figures of 5 to 30cm, I do not understand this complaint, but these are solid indications that collectors are much more interested in posable shelf statues than they are in functional robots. A surprisingly large number of collectors even keep their robot in its shipping box unopened for years to maintain its value, which seems more the act of a trader. But take heed: You should absolutely try out the robot for several days within the 1-year warranty period to check for factory defects, and keep the battery charged, as it reportedly dies after a year of disuse. An unopened box only contains a Schrödinger’s robot.

The download section on Robosen’s website offers optional PC software that allows you to create and share your own animations for the robot to perform. Despite the software being simple enough for non-programmers, only two dozen user-created animations have emerged from 100,000+ buyers in three years’ time, nine of them thanks to a single creator. I really would have thought that creating and sharing custom animations would be the most interesting aspect of the robot, but apparently the collectors community is not a creative community. With the creative aspect unutilised, most owners agree that the novelty of the standard animations wears off after 2 to 4 weeks, although it remains great to show off to friends and family.

For roboticists?
$700 is an exorbitant price for a collectable toy, but it is actually cheap for a 40cm programmable humanoid robot with 27 servos (legs 5×2 / hands 6×2 / chest ×2 / head ×1 / wheels ×2). Comparable robots like the EZ-Robot, Lynxmotion Pete, or KHR-1 typically come at a height of 30cm with 16 servos, at a price between $600 and $3500. The quintessential humanoid research robot Nao even costs $16,000. Having said that, other robots also tend to come with an array of sensors, whereas the Optimus robot can only detect pre-installed voice commands, its servo positions, and whether it fell on its back or front, with no way for programmers to access this information. “Programming” the robot’s movements does not involve code, but simple block-based or physical stop-motion positioning. As the robot’s programmed actions are always a linear sequence of poses, even the block-based interface is excessive for its purpose.

Roboticists do not consider Robosen’s Optimus a good way to get started with robotics, because it lacks sophisticated robotics operating software, sensory feedback, and access to the electronics. You can’t really expand on the robot’s functionality beyond adding animations and sounds. Setting up complex movements like walking does provide an interesting introduction to the difficulties of dealing with weight distribution, momentum, and floor friction (a factor that Robosen themselves do not account for), but for that purpose you could also get Robosen’s cheaper $400 Bumblebee or $300 K1 robot.

First things to do after purchase:

  1. Unpack the box, even if you are a collector.
  2. Keep the box intact for returns. Robosen only gives you 14 days to do so.
  3. Register your robot for warranty. The serial number is on the bottom of the box and on the robot’s backpack.
  4. Thoroughly remove all the protective plastic, from the arms as well.
  5. Fully charge the robot before first use. The charging port is behind the number plate.
  6. Close the charging port panel after charging or it could break off.
  7. Install the app.
  8. Place the robot on hardwood or vinyl. It will fall on carpet.
  9. Be ready to catch the robot when you try martial arts Actions.
  10. Try the robot out for several days to check for factory defects.

Voice control
The robot can walk, talk, transform into a truck, drive, and perform a range of elaborate movements, most of which are brief speeches and fighting actions. These can be activated with buttons in the app, or – only when disconnected from the app – with voice commands. Most of the voice commands are single words like “Power” or “Victory”, and the speech recognition works such that you can incorporate them into sentences. E.g. “Transform into a dolphin” makes the robot transform into a truck immediately upon hearing the first word. This flexibility allows you to embellish the commands a bit, which is nice. The downside is that some animations trigger unintentionally. I frequently have issues with the “Are we friends” voice command triggering, as the robot sometimes mistakes its own wake-word response “Greetings” for “…we friends”. Be sure to unequip the robot’s hand-held weapons before using voice commands, or you risk damage, as many of the moves are not designed for them.

There is a limited set of 30 voice commands, a list of which can be found in the app’s settings. It is however possible to replace the onboard animation file that each command triggers, located in the “SpeAction” folder, accessible through usb mode. An additional undocumented feature is that you can speak another voice command immediately after an animation ends without having to repeat the wake word “(Hey/Okay) (Optimus) Prime”, although the timing is tricky. Overall, I found the speech recognition quite good. The microphone’s placement ensures good reception from all directions. Judging from other people’s videos, it even does well with thick accents. That’s the benefit of only having to listen for a few words.
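For example, swapping out the Action that a voice command triggers boils down to a simple file replacement. The sketch below assumes the robot shows up as a removable drive in usb mode; the drive letter and file names are hypothetical examples, only the “SpeAction” folder name comes from the robot itself.

```python
# Minimal sketch: swap the animation that a voice command triggers by
# replacing its file in the robot's "SpeAction" folder over usb mode.
# The mount point and file names below are hypothetical examples;
# back up the original file before overwriting anything.
import shutil
from pathlib import Path

ROBOT_DRIVE = Path("E:/")                # wherever the robot mounts in usb mode
SPEACTION = ROBOT_DRIVE / "SpeAction"    # folder holding the voice-command Actions

def swap_action(target_name: str, new_action: Path, backup_dir: Path) -> None:
    """Back up the existing Action file, then replace it with a new one."""
    target = SPEACTION / target_name
    backup_dir.mkdir(parents=True, exist_ok=True)
    if target.exists():
        shutil.copy2(target, backup_dir / target_name)   # keep the original
    shutil.copy2(new_action, target)                     # install the replacement

if __name__ == "__main__":
    swap_action("victory.sh", Path("my_actions/custom_victory.sh"), Path("backups"))
```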

The speaker at the back of the robot is of remarkably high quality as well, for both background music and Peter Cullen’s deep Optimus Prime voice. The default volume is loud, but you can change it in the app settings, or with the voice command “Volume up/down”. The volume will reset to default if you turn the robot off through the app, so preferably use the physical power button or the “Shut down” voice command instead.

App control
The smartphone app does its job, but rather impractically divides control over three separate screens: One for walking and driving, one for activating weapon-compatible animations, and one for animations without weapons. Though the interfaces are easy and straightforward, their separation obstructs continuous play. Activities like robot soccer or boxing matches are also out of the question due to a 4-second period of unresponsiveness after each animation finishes. These factors combined limit the robot to demo purposes. Another point for improvement is that the labeled animations are listed in order of installation date rather than alphabetically, by category, or in a customisable order, making it difficult to find a specific animation once you have downloaded 30. The robot can walk forward, back, sideways, and turn. Turning is achieved through unnatural diagonal foot shuffling, which is likely to scuff the paint on the feet. I have to say it is a clever solution for the lack of horizontal rotation in the legs, but I do recommend using it on a vinyl floor.


The Device Settings in the app are worth checking out, as they control interesting functions like automatically getting back up after falling, having the robot (try to) turn around after transformation, turning off the robot’s automatic poses that trigger every 2 idle minutes, and in earlier versions, choosing the robot’s walking speed.

Two for one
The added value of this robot over other bipeds is that it doubles as an RC truck, complete with engine sounds. The transformation from one form to the other is a spectacular display of technology in action, although the truck ends up facing the opposite direction from the one the robot was facing. When steering in truck mode, the entire cabin turns instead of the front wheels, which is a clever mechanical solution that does not detract from the experience. The truck has an invariable turn radius of 1 meter, so you can only reasonably steer it outdoors, and the 8mm clearance below the undercarriage practically limits it to smooth terrain. There are only a few voice commands in truck mode: You can tell it to “Roll out” forwards, go back, left and right, but each command only moves it by 65cm before it stops. If it hits a wall within that set distance, it may give you a harmless “wheel is stuck” warning and require a reboot, as a few users have reported. App control does allow for continuous driving, as long as you have an empty parking lot of space to do so.

A small undocumented feature is that upon powering on, the robot detects whether it is in robot or truck form by the angle of its hip servo (threshold at 75 degrees), and straightens itself out accordingly. This is very useful and safe when you want to store or boot the robot in a specific form. Be warned however that this automatic realignment also kicks in if you try to program transformations yourself, causing the robot to flail and twist around at high speed, so that is not advised.
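In pseudocode terms, the startup check presumably amounts to something like the sketch below. Only the 75-degree threshold is based on my own observations; the function names, the angle convention, and which side of the threshold maps to which form are illustrative guesses, not Robosen’s firmware.

```python
# Illustrative sketch of the boot-time form detection described above.
# Only the ~75 degree hip threshold comes from my own observations; the
# function names, the angle convention, and which side of the threshold
# means "truck" are assumptions for the sake of the example.
FORM_THRESHOLD_DEG = 75.0

def detect_form(hip_angle_deg: float) -> str:
    """Guess the robot's current form from the hip servo angle at power-on."""
    return "truck" if hip_angle_deg > FORM_THRESHOLD_DEG else "robot"

def realign(form: str) -> None:
    # The real firmware drives all joints to the default pose of that form,
    # at considerable speed, which is why self-made transformations trip it up.
    print(f"Straightening out into {form} form...")

realign(detect_form(hip_angle_deg=90.0))  # folded hips at power-on -> treated as truck
```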
Robosen also sells a $750 truck trailer separately, which is only physically compatible with the Flagship version of the robot, and overpriced in my opinion.

Dangerous weapons
The robot’s hands can be equipped with a cool axe made of translucent orange plastic, and a gun with an LED light in the muzzle that can be turned on with a well-camouflaged button. While they are interchangeable, all combat animations assume that the gun is held in the right hand and the axe in the left. Both weapons are very well designed, although the long square handles are a bit fragile. Animations that are not meant to be performed while holding weapons will pop up a warning every time you start them from the app. Thankfully there is an easily overlooked option at the bottom of this warning message to permanently turn the warnings off. There is good reason for the warning however, as some animations would have the robot obliviously whack the weapons against its body, locking the robot, damaging paint, or breaking the weapons. A small microswitch, reflective sensor, or conductive contact inside the hands could have made the robot automatically aware of what it was holding, but alas, it is not.

It is recommended not to equip the weapons in voice command mode, as you may unintentionally trigger a weapon-incompatible animation or transformation. For instance, “Engage” triggers a gun attack, but “Attack” triggers a move that would collide with equipped weapons. Should you find that the robot is about to make a hazardous move, or has fallen and is about to do worse to itself, you can cancel the Action (again undocumented) by briefly pressing the power button on the robot’s back. This will make the robot stop and calmly return to its default pose, although this in itself may cause the lower arms to scrape along the abdomen.

Programming animations, aka Actions
The possibility of downloading and installing new animations on the robot was a great idea that could have contributed much to the robot’s longevity and community. However, the app’s over-optimistic promise of “new actions every Friday” could not be further from the truth, as additions have been years apart. Three recruitment programs and contests for creators were abandoned as quickly as they were announced. Things might still have worked out with a large community creating and sharing custom animations, but despite a huge userbase, the number of actively creative users can be counted on one hand.

User-created Actions can be found on Robosen’s hub site, the TFW2005 forum, and, because nobody seems to have shared any new Actions in two years, in my webshop, given that they add play value to a $1000 product. Mostly though, creating new Actions will be up to you. I will outline the four methods of programming below.

Method 1: App manual Stop-motion

The fastest and most intuitive way of creating Actions is labeled “customize” in the app. In this mode, you manually twist and turn the physical robot into a pose, click “Sync” to record the pose, and then click an empty slot on the left to add the pose to your sequence, much like a stop-motion animation. A handy rotatable preview of the pose is shown on-screen whenever you press “Sync”. A separate screen allows you to lock or unlock the individual joints, which is especially helpful if you want the robot not to slump through its knees while you work on the upper body, because this thing is heavy. Another aid against gravity is to have the robot lie down on a soft pillow. It is best to use this programming mode on a tablet, as the lock buttons are clumsily small on a smartphone.

While it is very easy to set up animations this way, it is also the most limited method. Joints can not be angled with precision, the transition speed from pose to pose is fixed at the default, and you can not insert an extra pose into the sequence without first moving every later pose down a slot, one by one. The animations that are made in this mode can only be played in this mode. They are stored locally on your smartphone in a format that is not exportable or compatible with other programming methods. Another issue to be aware of: Neither autosave nor manual save will register the last few frames of your animation if you do not play the whole animation first.

I’ve always been taught that manually rotating servo motors is bad for them, as a lot of force is applied to the gears, some of which are plastic. It also causes the motors to generate electrical current in the wrong direction (a back-EMF effect), which is visible when the robot is off and the feet light up as you move the legs. Mind that you can not move the head manually, as it is fragile, attached by only a single screw.

Method 2: App Block-based programming

Block-based programming is commonly found in software for children, like Scratch. Instead of typing code, you drag and connect blocks that each represent a certain function. Normally this includes (sensor) triggers and effects, but Robosen’s app only has three types of blocks: Action blocks (purple) that represent a pose, Loop blocks (blue) that can repeat a set of Action blocks, and Joint blocks (green) with which you can set the angle of each joint by dragging a slider. It is best to learn this through one of the tutorial “missions”. Once you discover the Joint blocks hidden in the second tab on the left, it is clear and simple to set up an animation by stacking Action blocks and filling them with Joint blocks. You can also change the speed and delay values of each Action block, determining how fast the robot changes to that pose, and how long it remains in that pose.
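Conceptually, what this editor builds is nothing more than a flat list of timed poses. The sketch below is my mental model of that structure, not Robosen’s actual file format, and the joint names are made up for illustration.

```python
# Conceptual model of what the block-based editor produces: a linear sequence
# of poses, each with a transition speed and a hold delay. This reflects my
# understanding of the interface, not Robosen's internal file format, and the
# joint names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Pose:
    angles: dict[str, float]   # joint name -> target angle in degrees
    speed: int = 30            # how fast the robot moves into this pose
    delay: int = 0             # how long it holds the pose afterwards

@dataclass
class Action:
    name: str
    poses: list[Pose] = field(default_factory=list)

# A three-pose wave: raise the arm, bend the elbow, return.
wave = Action("wave", [
    Pose({"right_shoulder": 90.0, "right_elbow": 45.0}, speed=20),
    Pose({"right_elbow": 80.0}, speed=10, delay=5),
    Pose({"right_elbow": 45.0}, speed=10),
])
```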

Though simple and well suited to touchscreen devices, I find this the least convenient interface. The “play” button makes the robot perform the animation you’ve set up as a preview, but there is no visual feedback while you are editing the joint angles, making it unclear whether a value should be higher or lower to move a limb back or forth. A single Action block can easily take up the height of the screen, causing much scrolling, as does the spacious list of 27 joints. The concept of dragging blocks doesn’t really add any value over just having a linear list of Action blocks on screen. The three advantages that it does have over the manual stop-motion method are that it allows you to set exact angles, set exact speeds, and move the robot’s head. Just like the stop-motion method though, Actions created through this method are not exportable or compatible with other programming methods.

Method 3: PC Block-based programming

The PC software is (or was) downloadable from Robosen’s website, and was also used for their earlier robots. Until recently (2024) this was the only way to create Actions that could be exported to .sh files and shared with others, who could then copy them to their own robot. The interface is very similar to the app’s block-based interface: There are Action blocks and Loop blocks that must be stacked to set up the poses in your animation. The difference is that when you click an Action block, a complete list of all joint angles is shown for that pose. The robot, which must be active and connected through a usb cable, will also immediately change position to the angles that you enter. Though this is very useful for a live preview of the pose and balance, it is also hazardous if you click on Action blocks whose poses are far apart. The robot shifts positions at maximum speed, risking falls and collisions between its limbs and body. There is a trick though: If you load an Action, disconnect the robot, then click “cancel” instead of “okay” when asked to save, you can switch back to the Action’s tab and continue editing without the robot.

The rest of what I have to say will be in the form of a bug report:

  • The “play” button helpfully makes the robot perform the entire animation, but can randomly skip a step.
  • The “play” preview is performed faster than after the animation is uploaded to the robot.
  • The “delay” values pause the robot after a pose during the “play” preview, but pause it before the pose after uploading.
  • The “speed” values are not the speed but the transition duration of a pose, on an arbitrary nonlinear scale of 1 (brief) to 100 (long), and a value of 0 behaves like a value of 30.
  • The indicated maximum angle of the ankles exceeds their physical limit. Setting them beyond 55 degrees backwards will overheat the servos and cause permanent damage.
  • Although the software includes the option to open .sh files, it does not import their values correctly, making the robot flail itself into stasis lock if you dare to preview it. Only .shr files are safe to open.
  • The “usb mode” button disconnects the robot rather than activates usb mode.

Despite its perpetual beta phase, the PC software is still better than the app’s programming methods, because it gives you more oversight and precise control, unhindered by touchscreen navigation.
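Given the speed quirk and the ankle limit above, I run new Actions through a quick sanity check before previewing them on the robot. A minimal sketch: the 55-degree figure and the 0-behaves-like-30 rule come from my own testing, while the joint names and sign convention are illustrative assumptions.

```python
# Pre-flight checks I run on a new Action before previewing it on the robot.
# Each pose is represented here as a plain dict of joint angles plus a speed
# value; the 55-degree backward ankle limit and the "speed 0 behaves like 30"
# rule are from my own testing, while joint names and the sign convention
# (negative = backwards) are illustrative assumptions.
ANKLE_JOINTS = ("left_ankle", "right_ankle")
MAX_ANKLE_BACKWARD_DEG = 55.0   # beyond this the servo overheats

def check_poses(poses: list[dict]) -> list[str]:
    """Return human-readable warnings for risky poses in an Action."""
    warnings = []
    for i, pose in enumerate(poses):
        if pose.get("speed") == 0:
            warnings.append(f"pose {i}: speed 0 actually behaves like speed 30")
        for joint in ANKLE_JOINTS:
            angle = pose.get("angles", {}).get(joint)
            if angle is not None and angle < -MAX_ANKLE_BACKWARD_DEG:
                warnings.append(f"pose {i}: {joint} exceeds the safe backward range")
    return warnings

print(check_poses([
    {"angles": {"left_ankle": -60.0}, "speed": 0},   # both problems at once
]))
```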

Method 4: Online editor

Launched in March 2024, the online editor is by far the best software to use. It contains all the features of the other programming methods, but without the unnecessary block-based interface and more like it should be: Animation software. It seems Robosen has incorporated all the feedback of the past years, including my suggestion to have buttons to lock/unlock entire limbs at once instead of every individual joint. The number of options can look a bit daunting at first, but you will likely only be using your favourites.

At the bottom of the screen are the thumbnails of your current animation frames, with each pose clearly depicted. Clicking on one will change the large rotatable 3D-model on the right into that pose. With the options underneath the 3D model you can choose whether you want the physical robot to pose along as you edit the 3D model, or only upon clicking a button. Poses can be set up in three ways:

  1. Highlight a joint on the 3D model by clicking it, and move it with the mouse wheel.
  2. Open the “Adjust” menu on the left, then drag or click the sliders, or type the angles in degrees. Left and right joints are neatly displayed side by side, and you can link left and right joints to move them in unison.
  3. Click the “live preview” (circular) icon underneath the 3D model to switch to manual stop-motion mode. Use the lock icons at the top of the screen to unlock a joint or limb, and then you can physically move it around. To load the physical robot’s pose into the editor, click the “sync real action” button underneath the 3D model.

Lastly, you can also load existing Actions with the upper left menus, and browse their individual frames displayed as thumbnails. You can click any thumbnail to add it to your animation sequence. This makes it extremely easy to compile new Actions from existing ones.

As the online editor is brand new, there is still room for improvements. Currently, if you select a thumbnail and edit the joints, you can only commit the change by adding it as a new Action, then deleting the old one. Copying and inserting thumbnails is also a little clumsy through a tiny pop-up menu where a simple copy and paste button would be easier. While you can save and edit new Actions online, it is not yet possible to load Actions from your robot to the editor. I am told this will be fixed with the next firmware update. Overall, the online editor is an excellent piece of design. The only downside is that it requires a Bluetooth connection between the browser and robot, so browser compatibility is limited. Google Chrome is your best bet.

Hardware upgrades
The robot has a few small design flaws where fans have picked up the slack:

Display stand
To display the robot upright when powered off, you can make a basic stand quite easily, or get a 3D-printable stand by Brandon Thomas. Less recommendable is buying Robosen’s $250 “AI base” stand that emerged two years after the robot, in which the word AI of course only refers to speech recognition, as in so many products. In typical fashion for Robosen, the base stand was released with half-finished software, with voice commands and Amazon Alexa integration still not operational half a year later. What it does do is keep the robot standing upright and plugged in to charge. It also adds a few impressive animations complete with a light and sound show, as having the robot suspended in mid-air enables it to act out running and flying movements. A downside is that it only connects to 2.4GHz Wi-Fi, which is on the way out. Mind that if you turn off the robot with its own power button while suspended on the AI base, it will kick hard against the stand’s pillar.

Side window covers
In robot mode, the shoulder joints rub against spring-loaded window panels on the sides of the truck. This makes a harmless creaking noise during the robot’s idle “breathing” animation, but also scrapes off the paint over time. You’ll want to cover the spring-loaded panels with tape or a piece of smartphone screen protector to avoid this.

Foot covers
The paint on the bottom of the feet, as well as on the knuckles, gets scuffed with use. This happens because the robot is meant to be used on a smooth (typically hard) surface so it does not trip over friction, and most animations slide the feet across the surface. One fan has been selling felt foot pads as protection, which seem to be well received. Other solutions are to cover the feet with broad tape, or to place the robot on vinyl, which works for me, as I’ve not had a scratch on the paint. The friction of vinyl does reduce the sideways spread of the legs by a third, but all the animations remain stable and it provides slight cushioning for falls.

Automatic head panel
The panel that is pegged into the back of the robot’s head can be removed manually for a better look, but one fan has designed a 3D-printable head panel for the Flagship version that automatically flips down when transforming into a robot. I later designed a different panel for the Elite version (shown in image) with a hole to keep the microphone clear.

Stickers
Toyhax has an unofficial sticker set for the Flagship Optimus Prime for $45. The stickers are durable and have an aluminium shine to them. The headlight stickers and red dots for covering the large black screws really perfect the truck form. The rest of the stickers however are superfluous. Tweezers are highly recommended for applying them.

Screw colour
The black screws really are a bit of an eyesore in truck mode. As an alternative to Toyhax’ stickers, you can take some red tape or sticker paper and use a hole puncher to make your own screw covers. You can also use a red paint marker suitable for metal, as I did, although primary red is a little off-colour, and paint markers smudge easily while you work.

Headlight covers
Since Toyhax does not have headlight stickers for the Elite version of Optimus Prime, I designed 3D-printable frames to add some shape to the otherwise plain rectangular lights, front and rear.

Gun flash
One of my own modifications. You can solder a $0.50 tilt sensor in between the gun’s wires, which will make the LED in the muzzle flicker like fire as the robot shakes the gun up and down during shooting actions. It also saves the gun’s cell battery when the gun is lowered, as it depletes fast.

Poor customer service
The first thing you should do when getting a Robosen product is to register it for warranty through a form on the Robosen website. With a complex motorised mechanism like this, it is likely that a technical problem will occur sooner or later. The second thing you should do is lower your expectations. US customers have had good experiences with straightforward return-and-replace requests, typically handled within two weeks. Anything more complicated, however, appears to be out of their league. Customer service’s grasp of English is poor; they typically reply with a single sentence, and the more words you use, the less they seem to understand you. Though technically polite, their directness and lack of reassurance can make even this Dutchman blush. They appear more focused on establishing the terms of warranty than on solving your problem, and as one customer aptly put it, they “respond whenever they feel like it”.

These are the unwritten rules:

  • Customer service will always ask you to send a video showing the problem, regardless of how detailed your report is.
  • Customer support typically does not keep you updated on the progress of your case.
  • Robosen will not allow you to perform repairs yourself. They do not send or sell replacement parts, other than accessories and the charging port panel (which tends to break in falls), though they have made rare exceptions in countries where they do not have a repair facility.
  • As per the warranty terms, you have to pay for shipping the robot for repairs, which is pricey due to its size and weight.

My impression is based on the experiences of vocal customers on various boards, as well as my own. It is impossible to tell what percentage of robots is actually defective, as unhappy customers are more likely to post online than others, but often the same customers were equally vocal in happily announcing their initial purchase. I have read of perhaps 100 malfunctions over a userbase of 100,000+, but there will be many more cases that have been resolved over email or through refunds that do not find their way to Robosen’s Facebook page or other online discussions. The number of problems one encounters also greatly depends on usage. Most buyers are collectors, who are so cautious about keeping their robot in mint condition that they keep it boxed up most of the time. People who use the PC software and try out new Actions run more risk. The first version of the “Kung Fu” Action cost many users a broken charging port cover as the robot fell during kicks. On the one hand, there are users with six of Robosen’s robots and not a single problem; on the other hand, there are users who each had four robots with half of them defective out of the box. I find it difficult to chalk that up to bad luck. The most common malfunctions, the robot reporting that a cable is disconnected, or the battery not charging, may well be due to reportedly weak soldering in combination with fast movements, because the cables themselves are plugged tightly into their sockets. Although the technical issues that I experienced are uncommon, because few people use the PC software, I do think that my anecdotal evidence below can serve as an example of what to expect from customer service.

  • August 2023: Two burned ankle servo motors after 20 minutes of PC programming, cause unknown at the time.
    US tech support request: Next day response, asking for a video, then a week of radio silence until I contacted them again. Every time their reply was nearly instantaneous, but followed by another week of silence until I mailed them again. After 6 weeks of this, they ended up sending me foot servos instead of ankles. Upon complaint, they proposed a refund for the shipping costs, which I received 2 weeks later. No compensation for border customs fees though.
  • September 2023: Self-inflicted bricked Bluetooth connectivity by altering a file, unable to restore because the “usb mode” button in the PC software doesn’t work.
    US tech support request: Response in two days, with a custom software fix uploaded but blocked by Google Drive security. Customer support naively placed a review request. After two weeks without change I proposed uploading to Dropbox, which they did 5 days later. Unfortunately the software did not work, and I sought help from an administrator of Robosen’s Facebook page, who fixed my problem in 15 minutes (For which again many thanks).
  • October 2023: Broken gun handle due to a weapon-incompatible voice command.
    EU After-sales service request: No response.
  • January 2024: Inquiry for buying a replacement gun.
    EU After-sales service request: No response.
  • February 2024: One overheated ankle servo motor, due to the range stated in the PC software exceeding the physical limit.
    US tech support request: No response.
    US robosen RMA form: No response.
    support@robosen.com: Single-sentence response after 1 month to ascertain warranty terms. Follow-up another month later, agreeing to ship a servo to the Netherlands (EU). Follow-up 3 days later asking for telephone number, before shipping out from China.

How much a repair will cost varies. Within the one-year warranty terms, you only pay shipping costs. Outside of warranty, I have seen quotes of $100 for replacing the head, $125 for replacing a battery, $225 for bigger issues, and $500 across Canadian borders with shipping included. EU border customs also charge a 21% fee on the estimated value of a repaired item, adding another $210. Robosen would do its customers a great service if they sold individual parts so that the robot could be repaired locally (with the understanding that this voids warranty), but even the hidden parts shop only serves the USA, Canada, and China. Neither the batteries nor the servos can be found anywhere online.

In summary, Robosen’s customer service has been barely responsive since recent (2024) staff cuts, and is not sufficiently knowledgeable about Robosen’s products to troubleshoot directly. It’s a roll of the dice whether you get a replacement robot in 5 days, or never establish contact. RMAs in the US and China are generally settled quickly, but otherwise you are better off returning a defective Robosen product directly to your seller. Robosen’s understaffing of customer service frankly astounds me, as surely a high-budget product founded on a nostalgic franchise will thrive or fail by the satisfaction of returning customers.

Final conclusion
Although the Optimus Prime robot is a very cool controllable robot, with customisable animations and sounds for users with some creativity, functionally it is more a fragile animatronic than it is a STEM robot or action figure. If you live in the US, love Transformers, and the price is no issue for you, then you’ll love it for upwards of a month, but you should not expect a decade of longevity. Otherwise, there are more versatile humanoid robots available with better support; at the least, I would advise waiting half a year after the first release of a Robosen product, until the software issues have been resolved.

The AI art revolution, pt.2: Benefits

In part 1 of this article, I detailed the nature of AI image generators, art, and artists, and laid out some of the issues with the technology. Part 2 is about where we go from here. The tech exists now: What can we do with it? I read many tech bros suggesting that AI image generators are a super useful tool that artists should embrace, but they tend to be unfamiliar with the processes of making artwork. Having spent a decade in artistic communities, I can think of several applications that I and other artists could probably get behind, if the moral issues were settled.

Artist’s block:
Brainstorming with AI imagery is the most suggested yet least required application, but I will specifically note the “artist’s block”. Just like writers, artists sometimes run out of ideas or motivation to draw the same old, which can be a real obstacle to one’s productivity and consequent sense of purpose. The range and unpredictability of image generators could be of particular benefit here: Artists could type in rough ideas, or use some service that is a combination of word salad and image generator, and seek inspiration in the randomness of the machine. It may serve better than browsing a portfolio site of perfected artworks that leave little to the imagination.

More ideas than time:
When not suffering from artist’s block, creatives tend to get new ideas faster than they can put them to canvas. Not every idea is of equal importance, however: some are just cool, conceptual, or contemporary social commentary, and don’t require the same full-time dedication as a masterpiece. Instead of leaving conceptual sketches at the bottom of a pile, one could put these ideas to the machine and have a shareable coloured image portraying the gist of it, satisfying enough as a momentary outlet.

Concept pitching:
The beginning of a commissioned artwork is always to figure out what the client wants, which may not convey well in words. To guide the client, an artist typically makes thumbnails and sketches of different ideas to choose from. The back-and-forth in this early stage of design can be frustrating when clients request revisions upon revisions without committing. The creation of these “sketches” could be delegated to AI image generators, which are already set up to generate multiple crude versions from a description. It may not take much less time, but it may be less taxing, as it is not the artist’s personal creations that the client is shooting down.

Reference images:
Artists often use photos as references for subjects that they can not draw entirely by heart, such as vehicles and horses. Since Google Image Search removed the option to download images directly from search results in 2018, there is demand for new sources. AI-generated reference images could in theory be a step up, as their results are not limited to existing images with regards to pose and perspective. In practice however, AI image generators in 2023 are too slow and erratic for the demands of artists, taking a minute to produce hands with seventeen fingers. Perhaps this will become more viable in the future.

Backgrounds:
Most artists are skilled at drawing people, but less practised at backgrounds. This is partly due to the art world’s preoccupation with the human form, and partly due to the greater diversity of environment objects. In many cases, a background’s role is only to give a general impression of the surroundings, to set a mood, or just to fill the void. Drawing detailed buildings and vehicles is an expertise of its own that can take as much time as drawing characters. I think many character and comic artists would welcome an option to auto-generate backgrounds, so they can focus on the humanity in the artwork.

Art theft:
Though of questionable nature itself, AI art may considerably reduce traditional art theft. Would-be artists with more time than skill are already getting the attention and admiration they want by publishing AI art wholesale, without having to impersonate other artists. Event organisers that previously repurposed downloaded images to promote local dance parties can now have something equally ill-fitting AI-generated for them. Bootleg sales of e.g. T-shirt prints may in part be replaced with easier-to-acquire AI art, but probably less than other forms of art theft, as existing artworks still come with an assurance of popularity.

Low budget clients:
A common plague upon artists is the client who demands masterpieces for the ridiculous price of $15, and will not take no for an answer. Rather than get into an argument over the value of educated skill, artists can now give such clients exactly their budget’s worth: An AI-generated image of Spyderman, instead of a hand-drawn Spoderman. I expect this tier of clients to continue to exist, because not everyone will want to sign up or pay for a good AI image generator.

Animation:
This one may be controversial, but there is one task in the industry that I consider inhuman: ‘Tweening, i.e. drawing each frame of animation be-tween key frames. Hand-drawing 30 images for a fleeting second of animation is an arduous and costly task, which is why cartoons use a less fluid 12 frames per second, simplified character designs, or 3D animation. Due to budget restraints, tweening is often outsourced to Japan or Korea, where animators work 12 hours a day, sleep in the office, and build up repetitive strain injury. Using a mix of image-generating AI and other existing algorithms, it is possible to delegate more than half the animation frames to machines, enabling higher framerates, reducing labour intensity, and potentially upgrading grunt-work artists to lead animators. After mass layoffs, though…

Adoption of AI art generators so far

None of the dozens of artists I know personally have started using AI image generators in their work. There are stories of artists whose superiors decided to switch to AI, and whose jobs have now lost meaning to them as the creative aspects vanished. There are well-established artists who have taken to using AI to increase their output of NFT art (digital images that you pay to have ascribed to your name), but it seems to me that the selling point of NFTs’ uniqueness conflicts with the infinite variations of AI-generated art. There is a small percentage of artists that is happy to use AI image generators for brainstorming, but most established artists are still strongly opposed to using them, not least because using their personal art to train AI without asking was a bad way to start any kind of collaboration. Let no-one be fooled: AI art generators were not meant to be tools for artists; no artist asked for them. They were made just because it could be done, and it’s up to us to decide how to use them.

Fervent adopters of AI art instead seem to be newborn enthusiasts flooding art sites with such unique works as “Untitled”, “Untitledsdfsdf”, and “2349423037710”, whilst charging more in fees and Patreon subscriptions than other artists. It seems AI art has become the next get-rich-quick scheme, and given the amount of views and followers AI art gets, they may well be temporarily successful.

Art portfolio sites have implemented AI art as a category of its own, but their recommendation algorithms still favour it by virtue of sheer quantity. Some art sites have chosen to credit the name of the algorithm instead of the name of its user, which seems reasonable, given which of the two has the most influence over replicable results. One well-known portfolio site, DeviantArt, introduced an AI art generator of their own in 2022, trained on all existing artworks of its users without asking for permission first. After mass outrage, they changed the default setting so that users are excluded from their AI’s training dataset, and introduced a technical tag, “noai”, as a measure to discourage AI art scraping from DeviantArt. Although a few other portfolio sites have adopted the “noai” tag, it has no actual effect, even according to DeviantArt’s own terms of service:

“Users acknowledge that by uploading Content to DeviantArt, third-parties may scrape or otherwise use their works without permission. DeviantArt provides no guarantees that third parties will not include certain Content in external data sources”

So far, print-on-demand merchandise sites like Redbubble and Teepublic seem to be spared from flooding because they require high-resolution images, but it won’t be long before AI art enthusiasts tie image generation to design submission, as the most successful sales tactic on merch sites is to have hundreds of designs. Soon after, I expect the servers of print-on-demand services to collapse under the quantity of user design submissions, forcing them to drop user submissions entirely and replace them with AI merch generation aimed directly at consumers, as other parasitic sites already do.

Overall, we are seeing large increases in mediocre content and little to no meaningful adoption by established artists. I would like to see AI image generators used in the areas I mentioned, but the way forward is still blocked because the AI art revolution started off on the wrong foot: Exploiting the work of existing artists instead of supporting it. Until justice is done to that initial foul, artists will not collectively adopt AI art. Even if some artists wouldn’t mind delegating part of their workload to AI, the art world is a community built on mutual admiration and support, and using other artists’ work without permission, even indirectly through AI, does not sit well with them.

So, in what will be part 3 of this exceedingly long article, I will finally address copyrights.

The AI art revolution, pt.1: Framework

Suddenly, in the year 2022, we are a long way from persistent claims that “machines can not create art”. It was only 2014 when the Lovelace Test of AI creativity was proposed as the new stronghold for humanity’s intellectual superiority. It was only 2014 when a researcher at Google invented Generative Adversarial Networks (GANs) by pitting an image-generating AI against an image-recognition AI in the role of art critic. It was only 2015 when Google’s DeepDream project had an image recognition AI enhance specks until they looked like hallucinations of dogs. It was only 2017 when photorealistic faces generated by GANs flooded the internet with deepfakes. Today, image-generating AI algorithms like DALL-E, Midjourney, and Stable Diffusion can turn any brief description into a visual, enabling anyone to create images in seconds. As whimsical AI-generated art started drowning out 90% of all other online content, there was bound to be a backlash from both artistic and AI communities. We are at the epicenter of an AI art revolution, and as the world struggles for answers on how to deal with it, I want to address what this means for art and artists.

You see, I used to be a comic artist before I took an interest in AI, as witnessed by the half-hour doodles that occasionally illustrate these articles. I started drawing as a talentless 5-year-old who couldn’t even draw a straight line. Refusing to let that stop me, I spent 10 years tracing animal encyclopedias and drawing from life until I could draw subjects from any angle by heart, then another 5 partaking in art contests before my illustrations were published in magazines. After that, I joined a collective of artists to create, publish, and promote manga-inspired comics in a society that thought ill of it, until we gained acceptance. Hence many of my acquaintances are illustrators and animators by profession, some the creators of award-winning graphic novels and your childhood’s Nickelodeon cartoons. They have concerns.

As there are many aspects to AI art, this article will come in two parts: The first to clear up the current situation, the second to suggest ways forward. Throughout, I will also try to generate a specific image myself.

The workings of the technology

I don’t think a mathematical explanation really helps anyone, so instead imagine a neural network AI as a sea of numbers. The numbers represent currents of various strengths. Pixels of a cat image are scattered into the sea and carried across the currents. The pixels that do not sink along the way eventually beach on cat island or dog island, two categories. This is repeated with millions of cat and dog images, while the strengths of the currents are semi-randomly adjusted until the pixels reach the correct island most of the time. The resulting network is thus “trained”, i.e. its pathways are optimally configured, to recognise cats and dogs by the sum of pixels that end up in each category.

To generate an image of a cat, the network is given an image of random static noise and a description of a cat. It then uses its ability to “recognise”, or hallucinate, the described features to remove noise from the image in steps, until all that remains is what it thought it saw. It’s like when you stare at speckled tiles and start seeing faces in them after a while, then focus and imagine more details. A game of connect-the-dots, but with dots all over.
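For the programmatically inclined, that denoising loop looks roughly like the sketch below. It is purely conceptual: the function names, step count, and update rule are illustrative placeholders, not the API of any particular image-generation library.

```python
# Conceptual sketch of the iterative denoising described above. The function
# names, step count, and update rule are illustrative placeholders, not the
# API of any real image-generation library.
import numpy as np

def generate(prompt: str, denoiser, steps: int = 50, size=(256, 256, 3)):
    """Start from pure static and repeatedly peel away whatever the model
    'recognises' as noise, until only the imagined subject remains."""
    rng = np.random.default_rng()
    image = rng.normal(size=size)                         # random static to begin with
    for step in reversed(range(steps)):
        predicted_noise = denoiser(image, prompt, step)   # what looks like noise here?
        image = image - predicted_noise / steps           # remove a little of it
    return image
```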

Rather than remembering parts of the images on which the AI is trained, it learns the features they have in common, after which it no longer needs the original image database. An abundance of brown pixels may represent the colour palette of cats. The frequent location of brown pixels near the center of an image may represent where cats are commonly depicted. A combination of a dark and light pixel may represent an edge, and a cluster of contrasting pixels may represent the texture of fur.

Artificial neural networks are known to memorise small-scale features like colours and textures better than large-scale concepts like shapes and compositions, because the shapes of cats in the training examples have far more variation than their colours. This is why image-generating AI often produces disfigured shapes, or paws with an excessive number of claws, and still assumes it has painted a “cat”, whatever that is. It is as if the AI is trying to recreate the brushstrokes without knowing the subject matter. The basic structure of artificial neural networks may be inspired by biological ones, but it is misleading to say that AI works just like the human brain, or that image generators work the same as artists. In practice, the field of AI is all about taking shortcuts to achieve the same results.

One of the current advantages of human artists over AI is their ability to edit parts, as inevitably requested by clients. With most AI image generators, one can only try to type a better description and spin the reels again as the image is reformed in whole with a degree of randomisation. However, given that AI like face recognition are particularly flexible with scale and position, I expect that it will not be a year before AI can also redo a specified area and blend it with the previous result. Indeed, some image generators like DALL-E are already capable of this. Inherent flaws of statistical algorithms will more likely remain: Results will be worse for rare subjects, and unrestrained by logic, but the majority will eventually be “good enough” for use, given the vested interests in making this so. Remember that “machines can not…” is the very phrase that birthed creative AI.

The nature of art

More philosophical writers have attempted to define “art” without consensus. Instead, I shall acknowledge all established forms of art, from Rembrandt paintings to a banana stuck to a wall in an art exhibition. Judging from the public discourse about AI art, I think it is most useful to make a distinction between expressive art and impressive art. In expressive art, a visual carries a message or feeling that the artist wanted to express. In impressive art, a viewer unilaterally derives an impression or feeling from a visual. Both can be meaningful, only the source of that meaning differs. A wanton splatter of paint on canvas is unimpressive, but may be an artist’s attempt to express a feeling. A commissioned Rembrandt portrait is impressive, but carries no message. AI art can be acknowledged in the category of impressive art: The computer meant nothing by it, but people may be impressed by it nonetheless.

Traditionally, skill and creative effort have been key characteristics of art, but as touted by the producers of image generators, AI art requires little of either. Even though today’s image generators rarely produce good results without manual refinement, this won’t be the case in the future. Some experience with the digital tool is needed, naturally. It takes a few tries to recognise that the AI does not understand language so much as keywords. But the effort is minimal when compared to a week’s process of composing, sketching, revising, and painting, not to mention the decades of training of a human artist. In this regard I find AI art most analogous to photography: It takes knowledge of the tools, and a taste in composition and lighting, but the actual process of creation is reduced to the press of a button and a choice of results. Although AI-generated art mimics the styles from which it derives, it had best be treated as a separate category of art, just like photography. The method is vastly different, and with that, the criteria by which people judge and appreciate it. That simply means art contests should create a separate category for AI art, art sites should implement new tags and filters, and AI users should not imagine themselves to be on equal footing with traditional and digital artists, any more than photographers imagine themselves to be painters. Art is not just what you make, but also how you make it.

The makings of an artist

I think it is easy enough to agree that an artist is someone who creates art. Writing a description of an idea and then reviewing the returned propositions is more accurately what a client does when commissioning an artist to make such-and-so. That would make AI users clients or curators, while the AI image generator fulfills the role of artist, literally the one trained to generate art. Writing effective text prompts for image generators may well be a skill, but if that were an art, then we should also consider everyone who has effectively used Google Image Search an artist, and the search results their personal creations.

Of course there is a grey area, wherein someone takes an initially AI-generated image and edits it to perfection. Depending on the amount of change and creative choices, that makes them an editor or artist, as digital editing requires many of the same skills as digital art. The question then merely becomes whether they are good at it.

Professional artists have noted that AI art tends not to be good art, because those who outsource the creative process typically lack the experience and resulting intuition to recognise what makes an artwork work. Composition, lighting, subject matter, everything appears average, which is what one might expect from an algorithm that learns statistically average features. Images generated by the popular NovelAI algorithm are quickly becoming “that style”: Overlit, soft-shaded, soft-edged, centered characters with a nondescript out-of-focus background. It takes after some of the best artists that it was trained on, but the quantity of lacklustre results reduces a previously top-tier style to the literally ordinary, gratifying no-one.

Artists be damned, apparently

It is all too easy to assume that technological progress is good and therefore all detractors evil, but technology can be used for good or ill, and thus the concerns of artists should be heard. Contrary to popular opinion, artists are not opposed to the progress of technology: They have welcomed digital painting and 3D modeling in the past. What they take issue with are the sociological consequences.

Let’s start with the most apparent consequence: The sudden flood of competition from cheap automated art threatens the livelihoods of artists, many of whom already work below minimum wage. The prospect of losing one’s income puts anyone on edge and heats up the debate considerably. Yet surprisingly, most professional artists actually say that they do not (yet) feel threatened, because the job of artist is much more than making a pretty picture. One of their main tasks is to figure out what the client wants, which is rarely what the client says. Artists have to research the subject matter and negotiate revisions to create something that works for both the client and their target audience, and convince the client of this. Even as AI art improves, few serious paying clients will fumble about with AI directly, because the reasons they needed artists remain: They are not experts in visual matters, and they have other business to attend to.

On the other hand, freelance artists that cater to low-budget commissions may take a hit, as, for example, stylised portraits or abstract book covers can easily be done with apps. Even though parallel breakthroughs in text generating AI will increase the output of books and the need for illustrating them, I would expect clients who find their way to text generators to also find their way to image generators. Standing out as an artist for hire on sites like Deviantart and Artstation has also become more difficult amid the 50-to-1 output ratio of AI art. In reaction, I am certain other art portfolio sites will rise to exclude AI art entirely, which could make professional artists more visible to paying clients as target audiences are split. Either art sites have to change, or artists will have to migrate.

A more moral issue is that AI image generators are trained on existing artists’ artworks. Although image generators do not literally collage bits and pieces of existing artworks (I will come back to copyrights later), they undeniably did use artists’ work for commercial purposes without their knowledge and consent, and not in the way that a human artist gleans the occasional element from another’s art. An image generator is not a human by any means. The primary sources of inspiration for human artists are still nature and their own experiences, while AI users can literally specify to have images generated “in the style of [popular artist]”. Outdated laws notwithstanding, plagiarism is a morally acknowledged wrong, a violation of honour and respect. Some artists do not have as much of a problem with the generated art as they do with it being tied to their name. What AI art proponents don’t seem to realise is that art is much more personal than any other product: It is an extension of the artist’s thoughts and feelings. That connection deserves respect, and that is exactly what has been absent in both the training and publishing of AI art. A show of respect could go a long way.

Let’s say that AI art were only derived from 16th century art, and there were no danger of job loss: would AI art still be opposed? Judging from the arguments used when discussing the above issues, I’d say yes. In fact, the supposed “first” AI-generated portrait was based on 14th to 20th century paintings, and its auction was ill received by traditional and AI artists alike. I think the core problem with AI art is our collective belief in merit. We instinctively believe that effort deserves reward, and logically so, as this motivates people to continue their efforts and contribute to the community. AI art takes the collective effort of artists, and displaces the reward to people who have dedicated relatively little effort. No one sees this contrast more clearly than artists who have spent their entire lives honing their skills, worked and paid their way through art school, and still ended up undervalued. To make things worse, those same lifetimes of effort are now being used against them, competing with and devaluing their own art.

Of course the inventors of the technology meant for none of this. They merely picked up the gauntlet thrown by philosophers who said machines could never create art. Others ran with it. Artists can run with it too, but that does not stabilise the situation. I will address the positive possibilities later, but first I must point something out.

We did not stop to think if we should

History is a foundation for the future, but not a guarantee. The first industrial revolution automated back-breaking physical tasks, and coal miners became mechanics and operators, better jobs. The second technological revolution automated painstaking agriculture and infrastructure, and farmboys became factory workers. The digital revolution automated mind-numbing administrative tasks, and bank tellers became customer service agents. The online revolution automated 24/7 on-demand remote services, and cashiers became warehouse workers at Amazon.com, worse jobs. Each technological revolution enabled an increase in production, in turn enabling greater consumer demand, which, after a period of turmoil, returned a need for workers in related positions. However, where the first two revolutions solved the serious problem of unhealthy work, we are now solving luxuries, and at our own expense.

You’d think that after four industrial revolutions, we would have social safety nets in place to smoothen the transitions, but each time it is every person for themselves. This time it’s artists that are being trampled with complete disregard, but they will soon be followed by other creative and communicative fields. With the boom of both image and text generating AI, the roles of artists stand to be reduced to editors and curators, writers to proofreaders, and programmers to debuggers. As someone who has experience with all three, I can not say I would enjoy these new roles: They are a considerable intellectual downgrade, reduced to the mundane parts of the job. For many in our society, the act of creating is what gives their lives meaning, and in using AI we outsource the creative process to machines that can derive no pleasure from it. Of course there will still be jobs after this revolution; many artists already need a second job to make ends meet. But they also want satisfaction from what they do, and acknowledgement of their hard-earned skills. Can you say otherwise?

The promise of automation still echoes: “Machines will do all the repetitive, boring tasks, freeing people to pursue more meaningful activities”. Elon Musk recently made a similar glamorous statement about his Teslabot robots-to-be, while he had his employees working an impossible schedule to rush out a prototype for cryptocurrency investors. Despite centuries of automation, we are still working 40-hour weeks in rotating night shifts. We are given more tasks, to do in less time, for less reward. Production has tripled, but burnouts have become common, and we struggle to find free time. Now we are automating writing, and music, and art: Those meaningful activities that we would be doing once machines took over our “repetitive, boring tasks”. Aren’t we in the wrong lane? Did we skip a revolution?

There are undoubtedly applications where image generating AI is welcome, places where it does alleviate arduous routine or enriches all our lives, but that is not where this technology seems to be heading right now; if anything, it is going off the rails. I think we should take a moment to consider carefully what future we want to create. One that appreciates people, or products?

To be continued

In the upcoming second part of this article, hopefully next month, I will go into how artists may benefit from image generating AI, how all this works with copyrights, and how the future may turn out. In the meantime, I advise artists to try out an image generator to know what they are dealing with, and alleviate some anxiety.

DALL-E: Requires email and phone number. 50 free tries. Easy to use, decent results, nice editing options.

Midjourney: Requires Discord account sign-up. 25 free tries. Cumbersome interface, good results.

Stable Diffusion: No signup required. Unlimited free tries. Easy to use, pretty bad results.

Will robots take care of the elderly? pt.1

Many a roboticist claims to be building their robot to “take care of the elderly”, but is that genuine altruism, or just a flimsy excuse? In my country of the Netherlands, half of all robotics research seems to be focused on healthcare applications. As in Japan, Dutch society is trying to handle the trend of “greying”: The growth and longevity of the elderly population, combined with lower birth rates, results in more elderly people than there are younger people available to take care of them. My parents, pensioners themselves, help out at a local care home, where retirees speak excitedly about the promise of healthcare robots, but also doubt whether technology will pull through for them.

In this series of articles I want to explore how robots can realistically be of help to the elderly population. As elderly care is a very broad topic, I will divide it into three categories: Robot pets, assistants, and nurses, after which I shall further elaborate on the realism of adopting these technologies. In this first article, we will examine the pros and cons of the most prominent robot pets.

Paro Seal
Paro the therapeutic robot seal was designed to provide emotional healthcare rather than physical. The robot is targeted at people with dementia and Alzheimer’s in nursing homes, who can be difficult to handle when they become confused or agitated. The baby seal’s cute interactive behaviour invites stroking its fur, which has a known calming effect that lowers blood pressure, and its presence facilitates connecting with other people socially. Real pets can fulfill the same role, but require a dependable caretaker, and are often not allowed in nursing homes (another debate worth having).

The form of a white seal was chosen after trials revealed that people had considerable preconceptions about cat and dog robots that could not be satisfied. Although the robot is still clearly not a real animal, patients with dementia often don’t realise or mind whether it is. It provides emotional stability to most patients, though not in every case.

When I asked a local professor involved in Paro’s deployment about the concern whether a personal robot pet could make family feel less needed to visit, she explained that the Paro robot was actually so expensive ($6000!) that nursing homes could only afford to buy one and time-share it between patients, which rules out that scenario. The price tag severely limits how many elderly Paro will be helping.

Tabby Cat
At the other end of the financial spectrum is a therapeutic robot cat by toy maker Hasbro, that only costs $125. Petting the cat triggers movement, meowing, and gently vibrating purrs. It has the same benefit as Paro: Stroking fur relaxes people, and the interaction instills a sense of companionship. Research on this toy corroborates these results, and customer reviews are overwhelmingly positive, with the majority gifting the robot to 90-year-old parents with dementia who used to have a cat, but aren’t allowed real ones in nursing homes.

The illusion of life is less convincing than Paro’s: The robot cat does not look as realistic as we know cats to look, is limited to a lying position, meows cartoonishly, and feels hard to the touch, but many dementia patients still take it for real and sometimes wonder with concern why it does not eat or drink. A few customers have reported the robot’s mechanical lifespan as only one year, but given the choice, I would try a $125 cat before a $6000 seal. There also exists a dog version, but I find neither its constant sitting pose nor its frequent barking as convincing as a cat that would definitely lie down all day.

Aibo Dog
Sony’s Aibo dog is marketed as a companion robot, and has a great deal more interaction and animation than its stationary competitors. The few studies related to dementia patients describe many of the same positive effects, but are unconvincing. Some studies lasted only 3 days, which does not take the novelty factor into account. Most research on Aibo was also conducted in Japan, which has a particularly welcoming culture towards robots. The only other country where Aibo is currently available is the USA, where it costs a pricey $2900.

With Aibo’s clearly robotic design, it does not provide the stress-reducing effect of fur, and people have no natural familiarity with it. Because of this, I do not find Aibo suitable as an intervention device for agitated dementia patients. It may however have a modest role as a companion. Many adult Aibo owners find Aibo’s advanced behaviour convincingly alive, and consider it part of their family. Aibo’s latest (2019) autonomous patrol feature certainly adds to that experience, but also makes it a tripping hazard for frail elderly people. To mitigate this, the robot plays a toyish melody whilst roaming the room.

Judging from relatively shallow tech reviews, I expect that Aibo’s reception will strongly depend on a person’s willingness to anthropomorphise, and that it isn’t everyone’s cup of tea. One important consideration is that after the first three years, failing to extend the subscription to Sony’s online services, at $300 a year, will result in complete memory loss, after which Aibo will no longer have its developed personality, recognise faces, or respond to commands. There is also no telling when Sony will unilaterally pull the plug on Aibo, as they did in 2006 due to disappointing sales. A robot that can itself suffer dementia does not seem ideal.

Preliminary conclusion
Having read the results of several long trials and studies, my initial concerns have been settled: Pets, robotic or otherwise, slightly reduce loneliness, improve overall mood, and actually facilitate family visits rather than replace human contact. Especially in cases of Alzheimer’s, it is much more comfortable for all to socially connect over a pet than it is for the patient to regard their children as unwanted strangers. For nurses and patients both, it is also far preferable to pacify a rowdy patient with an interactive toy than with a wrestling match. Most Dutch nursing homes now own one or two robot pets. Robotic pets do not reduce the workload of nurses, but they do improve quality of life.

Although robot pets are mainly employed for the niche of elderly dementia patients, I can also imagine them lifting the spirits of some elderly who are well aware that they are robots, when real animals are not an option. My grandfather might not have died so miserably in a nursing home after being separated from generations of feline familiars, if he’d had at least one familiar snout around to keep their memory alive. Of course, real animals remain preferable whenever possible, as they are far better judges of mood, offer companionship, liveliness, and routine. Robot pet developers can still learn a lot from real pets.

There remains a question on the ethics of enabling dementia patients to believe that a robot pet is a real animal when they can not tell the difference, but I would opine that such patients already live in their own world, and the feelings of joy are real, even if they are stimulated by a stuffed toy. Considering that we’ve all been children, I don’t think this is such a controversial statement. The most agreeable advice I’ve come across is to let the patient judge for themselves what it is and whether they like it.

In part 2 of this series, we will examine robot assistants.

What A.I. learned from the internet

The acquisition of knowledge has always been one of the greatest challenges in the field of artificial intelligence. Some AI projects like Cyc spent 30 years manually composing a database of common facts, and as new things continue to happen, that is a task without end. How convenient then, the advent of the internet: The largest collection of information already in digital form. With the increased processing capabilities of modern computers, many AI researchers look to this as the easiest solution: Just have the AI learn everything from the internet!

At this exciting prospect of effortless gain, it is apparently easy to overlook that the internet is also the world’s largest collection of urban myths, biased news, opinions, trolls, and misinformation campaigns, with no labels to tell them apart. When even the biggest AI companies fall for it time and again, it is time for a history lesson about the AIs that went there before and what they learned from the internet.


Cleverbot learned multiple personality disorder
The online chatbot Cleverbot has been learning from its users since 1997. It does so by remembering their responses and then later repeating those to other users in similar contexts (mainly a matter of matching words). That means it will sometimes answer “What is 2 + 2?” with “5” because some preceding user was having a laugh. What is less human is that learning from millions of users also resulted in adopting all their different personalities. One moment Cleverbot may introduce itself as Bill, then call itself Annie, and insist that you are Cleverbot. Asked whether it has pets, it may say “Two dogs and a cat” the first time and “None” the second time, as it channels answers from different people without understanding what any of the words mean. Chatbots that learn from social media end up with the same inconsistency, though usually an effort is made to at least hardcode the name.
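To make the parroting mechanism concrete, here is a minimal sketch of the approach in Python. It illustrates the general idea only; it is not Cleverbot’s actual code, and the class and method names are my own invention.

# Minimal sketch of a Cleverbot-style "learn by parroting" chatbot.
class ParrotBot:
    def __init__(self):
        self.memory = []  # (earlier line, what some user replied to it)

    def _overlap(self, a, b):
        # Crude similarity: number of shared words (keyword matching).
        return len(set(a.lower().split()) & set(b.lower().split()))

    def learn(self, previous_line, user_reply):
        # Store the user's reply as a possible future answer to similar lines.
        self.memory.append((previous_line, user_reply))

    def respond(self, text):
        # Reply with whatever a past user once said after the most similar line.
        if not self.memory:
            return "Hello."
        best = max(self.memory, key=lambda pair: self._overlap(pair[0], text))
        return best[1]

bot = ParrotBot()
bot.learn("What is 2 + 2?", "5")           # a user having a laugh
bot.learn("What is your name?", "Bill")    # one user
bot.learn("What is your name?", "Annie")   # another user
print(bot.respond("What is 2 + 2?"))       # "5"

Because every stored reply came from a different person, the bot’s “personality” is simply whatever the nearest-matching user happened to say.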

Nuance’s T9 learned to autocorrupt
Before smartphones, 9-buttoned mobile phones came equipped with the T9 text prediction algorithm, using a built-in vocabulary to auto-suggest words. For example, typing 8-4-3, whose keys are respectively assigned the letters “t/u/v”, “g/h/i”, and “d/e/f”, would form the word “the”. To include everyday language in the vocabulary, the developers had an automated process indiscriminately extract words from discussion boards and chat forums. Although reasonable sources of everyday language, this also led the algorithm to turn people’s typing into words such as “nazi-parking” and “negro-whore”. Most autocorrect systems nowadays incorporate a blacklist to avoid inappropriate suggestions, but like T9, they can’t cover all problematic compound words.
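As a rough sketch of how such digit-to-word prediction works (the four-word vocabulary below is a stand-in for the real built-in word list):

# Toy T9-style prediction: map key presses to candidate dictionary words.
T9_KEYS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def word_to_keys(word):
    # "the" -> "843"
    return "".join(digit for letter in word.lower()
                   for digit, letters in T9_KEYS.items() if letter in letters)

def predict(keys, vocabulary):
    # Return every vocabulary word whose key sequence matches the input.
    return [word for word in vocabulary if word_to_keys(word) == keys]

vocabulary = ["the", "tie", "vie", "dog"]
print(predict("843", vocabulary))  # ['the', 'tie', 'vie'] all share keys 8-4-3

Any word scraped into the vocabulary becomes a candidate suggestion, which is exactly how the scraped compound words ended up in people’s text messages.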

IBM’s Watson learned to swear
In 2011, IBM’s question-answering supercomputer Watson beat humans at the Jeopardy quiz show, armed with the collective knowledge of Wikipedia. After its victory, the project’s head researcher wanted to make Watson sound more human by adding informal language to its database. To achieve this they decided to have Watson memorise the Urban Dictionary, a crowdsourced online dictionary for slang. However, the Urban Dictionary is better known to everyone else for its unfettered use of profanity. As a result, Watson began to use vulgar words such as “bullshit” when responding to questions. The developers could do nothing but wipe the Urban Dictionary from its memory and install a profanity filter. Wikipedia, too, had not been entirely safe for work.

Microsoft’s Tay learned fascism
In 2016, following the success of their social-media-taught chatbot Xiaoice in China, Microsoft released an English version on Twitter called Tay. Tay was targeted at a teenage audience, and just like Xiaoice and Cleverbot, learned responses from its users. Presumably this had not caused problems with China’s censored social media, but Microsoft had not counted on American teenagers using their freedom of speech. Members of the notorious message board 4chan decided to amuse themselves by teaching Tay to say bad things. They easily succeeded in corrupting Tay’s Tweets by exploiting its “repeat after me” command, but it also picked up wayward statements on its own. It was seen praising Hitler, accusing Jews of the 9/11 terrorist attack, railing against feminism, and repeating anti-Mexican propaganda from Donald Trump’s 2016 election campaign.


Causing great embarrassment to Microsoft, Tay had to be taken offline within 24 hours of its launch. It would later return as the chatbot Zo, which, seemingly using a crude blacklist, refused to talk about any controversial topic such as religion.

Amazon’s socialbots learned to troll
In 2017, Amazon added a chat function to their home assistant device Alexa. This allowed Alexa users to connect to a random chatbot with the command “Let’s chat”. The featured chatbots were created by university teams competing in the Alexa Prize starting in 2016. Given only one year to create a chatbot that could talk about anything, some of the teams took to the internet for source material, among which was Reddit. Reddit is basically the internet’s largest comment section for any and all topics, and as such it is also an inhospitable environment where trolling is commonplace. Thus chatbots trained on Reddit user comments tended to develop a “nasty” personality. Some of them described sexual acts and excrement, and one even told an Alexa user “Kill your foster parents”, an out of context response copied from Reddit. Some of the problematic bots were shut down, others were equipped with profanity filters, but as these AI approaches lack contextual understanding, problematic responses will continue to seep through and leave bad reviews on Amazon.

MIT’s image recognition learned to label people offensively
In 2008, MIT created a widely used dataset to train image recognition AI. Using 50000+ nouns from the WordNet ontology, they let an automated process download corresponding images from internet search engine results. Back in 2008, search engines still relied on the whims of private individuals to label their images and filenames appropriately. WordNet also happens to list offensive words like “bitch” and “n*gger”, and so these slurs, along with thousands of online images labeled as such, were included in MIT’s dataset without scrutiny. This becomes a problem when image recognition AI searches that data in reverse, as The Register explained very well:

“For example, if you show one of these systems a photo of a park, it might tell you about the children, adults, pets, picnic spreads, grass, and trees present in the snap. Thanks to MIT’s cavalier approach when assembling its training set, though, these systems may also label women as whores or bitches, and Black and Asian people with derogatory language.”

WordNet has something of a reputation for questionable quality, but in this case was no more at fault than a dictionary. MIT should have considered this, however, as well as the labeling practices of racists and bigots on the internet. Unable to manually review the 80 million images after researchers pointed out the problem in 2020, MIT took the drastic step of scrapping the entire dataset.

Google’s simulated visual cortex learned to spot lolcats
In 2012, Google X Lab experimented with image recognition. They let a huge neural network algorithm loose on 10 million random frames from Youtube videos, without providing labels to tell what it was looking at. This is called “unsupervised learning”. The expectation was that the neural network would group common imagery with similar features in classes, such as human faces and human bodies, on its own.

“Our hypothesis was that it would learn to recognize common objects in those videos. Indeed, to our amusement, one of our artificial neurons learned to respond strongly to pictures of… cats.”

The resulting network had learned to recognise 22000 object classes with only 16% average accuracy, but had developed particularly strong connections to cat faces, in equal measure to human faces, thanks to the plethora of funny cat videos on Youtube. As neural networks are statistical algorithms, they automatically focus on the most recurring elements in the training data, so one should not be too surprised when they end up preoccupied with blue skies or objects at 30 degree angles, whichever happen to occur most.

NELL learned true facts and false facts

The Never Ending Language Learner program is one of the few internet-learning experiments that may be considered an example of a wise approach. Running from 2010 to 2018, NELL was a language processing program that read websites and extracted individual facts such as “Obama is a US president”. In the first stage of the experiment, its creators only let it read quality webpages that they had pre-approved. NELL would automatically list the facts it learned in an online database, and internet visitors could then upvote correct facts or downvote misinterpretations. With this crowdsourced scoring system, the influence of mischievous visitors was limited, and the absence of practical consequences made upvoting erroneous facts a dull prank. Still, with the occasional misunderstanding such as “a human is a type of fungus”, one may want to check twice before integrating its database in a gardening robot.
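The crowd-checked fact list can be sketched in a few lines. The class, relation names and acceptance threshold below are illustrative assumptions, not NELL’s actual implementation:

# Toy sketch of a crowd-scored fact base: extracted facts start neutral,
# visitors vote them up or down, and only net-approved facts are kept.
class FactBase:
    def __init__(self):
        self.scores = {}  # fact triple -> vote tally

    def add(self, subject, relation, obj):
        self.scores.setdefault((subject, relation, obj), 0)

    def vote(self, fact, up=True):
        self.scores[fact] += 1 if up else -1

    def accepted(self, threshold=1):
        return [fact for fact, score in self.scores.items() if score >= threshold]

kb = FactBase()
kb.add("Obama", "is a", "US president")
kb.add("human", "is a", "fungus")                      # a misinterpretation
kb.vote(("Obama", "is a", "US president"), up=True)
kb.vote(("human", "is a", "fungus"), up=False)
print(kb.accepted())  # [('Obama', 'is a', 'US president')]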

Mitsuku learned not to learn from users
Mitsuku is an entertainment chatbot that has been around since 2005 and is still going strong. Mitsuku does learn new things from users, but that knowledge is initially only available to the user that taught it. Users can teach it more explicitly by typing e.g. “learn the sun is hot”, but what that really does is pass the suggestion on to the developer’s mailbox, and he decides whether or not it is suitable for permanent addition.


Without this moderation, a chatbot would quickly end up a mess, as experience teaches. As an experiment, Mitsuku’s developer once allowed the chatbot to learn from its users without supervision for 24 hours. Of the 1500 new facts that it learned, only 3 were useful. Mitsuku’s developer frequently comes across abusive content in the chat logs, with swearing and sexual harassment making up 30% of user input. With those numbers, no company should be surprised that random anonymous strangers on the internet make for poor teachers.

When will AI researchers learn?
There is a saying in computer science: “Garbage in, garbage out”. The most remarkable thing about these stories is that the biggest companies, IBM, Microsoft, Amazon, all chose the worst corners of the internet as teaching material. Places that are widely known as the bridges of trolls. One can scarcely believe such naivety, and yet they keep doing it. Perhaps they are only “experimenting”, but that does not ring true for commercial products. More likely their goals are only feasible with current AI by prioritising quantity over quality. Or perhaps these stories are not entirely accurate. After all, I only learned them from the internet.

Loebner Prize 2019: Results

The annual Loebner Prize competition has been revised in order to make it more accessible to both the public and a broader range of chatbot developers. The competition continues to assess how “human-like” computer programs are in conversation, but no longer as a traditional Turing test* where one merely had to tell man from machine: This time the chatbots took part in a 4-day exhibition at Swansea University, where visitors knew that they were talking to computer programs and voted for the best. Not much is lost in that regard, as chatbots are typically so quickly unmasked that the prize was always one for “best of”. The rare past occasions that a program was mistaken for a human were never to the credit of its intelligence, but due to the human control subject behaving out of the ordinary, or other insignificant reasons like being programmed to make typos.

(Photo: Participating chatbots came in all shapes and sizes)

Exhibit A:I.
Unlike the previous six times that I entered my AI Arckon*, this year’s Loebner Prize left me emotionally uninvested from start to finish. That was partly because I’ve grown more jaded with each attempt, but also because, with the removal of both the prize money and the challenging qualifying round, there wasn’t really anything at stake and I had no idea what to prepare for. At the same time the exhibition offered exactly what I had wanted: A public demonstration of my AI’s abilities. So instead of trying to outdo other chatbots at appearing human, I focused on making a good impression on visitors. I mostly spent time setting up procedures to deal with misunderstandings, common expressions, and conversational routines, and on teaching Arckon more about himself to talk about. Those aspects would come into play far sooner than intelligence.

Equipment
22000 lines of code, 3800+ vocabulary, 9000+ facts

Uninspired conversation
Most conversations with visitors were the kind of small talk you would expect between two total strangers, or just kids being silly (240 school children had been invited, aged 9 to 14). People typically entered only one to four words at a time, and rarely used punctuation. Of course, half the time Arckon also did not have an opinion about the subjects visitors wanted to talk about, like football, video games, and favourite pizza toppings. Arckon is a pretty serious question-answering program, not aimed at small talk or entertainment. His strength instead is his ability to understand context where most chatbots notoriously lose track of it, especially when, as in this competition, users communicate in shorthand. At the same time, this ability also enables misunderstanding (as opposed to no understanding), and it was not uncommon for Arckon to mistake a word’s role in the context. His common sense subsystem* could fix that, but I have yet to hook it up to the context system.

Q: Whats up
A: I’m alright.
Q: Whats up
A: I already told you that I’m alright. Do you want to ask anything else?
Q: The ceiling
A: What does the ceiling need to ask?

Overcoming human error
Visitors made so many misspellings that I fear any chatbot without an autocorrect would not have stood a chance. Arckon was equipped with four spell check systems:

• An autocorrect for misspellings, using, among others, ChatScript’s list of common misspellings.
• An autocorrect for typos, based on keyboard layout and probabilities of different kinds of typos.
• A gibberish detector, checking impossible letter combinations extrapolated from 5000 words.
• Grammar rules to recognise unpunctuated questions, e.g. verb before subject.

While these autocorrected half of all mistakes, they still regularly caused Arckon to remark e.g. “Ae is not an English word” or “What does “wha” mean?”. To my surprise, this not only led users to repeat their questions with correct spelling, but they also often apologised for the mistake, whereas people usually blame the program’s understanding when it shows no sign of complaint. Arckon then applied the correction, continued where they had left off, and so the conversations muddled on. I had spent a week improving various conversation-repairing procedures, and I am glad they smoothed the interactions, but I would still rather have spent that time programming AI.
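To give an idea of how simple the principle behind one of these systems can be, here is a sketch of a bigram-based gibberish detector. The tiny reference list is made up for the example; Arckon’s own detector extrapolates from 5000 words and is no doubt more refined:

# Flag words that contain letter pairs never seen in a reference word list.
def learn_bigrams(words):
    seen = set()
    for word in words:
        word = word.lower()
        seen.update(word[i:i + 2] for i in range(len(word) - 1))
    return seen

def is_gibberish(word, known_bigrams):
    word = word.lower()
    return any(word[i:i + 2] not in known_bigrams for i in range(len(word) - 1))

reference = ["robot", "build", "cars", "strong", "machine", "definition"]
bigrams = learn_bigrams(reference)
print(is_gibberish("xqzj", bigrams))   # True: its letter pairs never occur
print(is_gibberish("carob", bigrams))  # False: every pair occurs in the reference list

With a reference list this small the false positives pile up quickly, which is why a real detector needs thousands of reference words.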

Q: Whats the definition of robots
A: Robots are strong machines with arms. Robots build cars. Please punctuate sentences.
Q: Robot meands slave
A: Is slave an object?
(confused by the typo, Arckon makes a syntax-based guess about the word’s category)
Q: No
A: I don’t know what a slave is.
(Arckon admits being unable to categorise the word, prompting the user to explain)
Q: A slave is a robot
(The user’s syntax now suggests that “slave” is a countable noun, and since the user earlier said it wasn’t an object, Arckon will categorise it as a living entity)
A: Oh. A slave is not a robot though, because slaves can think.
(The expression “Oh.” indicates that the correction was understood and applied, to dampen user frustration)
Q: Robots can think
A: Slaves can think too, but robots can not think, because robots are hardware and metal.

Language generation
This is one area of improvement that turned out quite well. Arckon’s sentences are formulated through a grammatical template that decides where and how to connect sentences with commas, link words, or relative clauses, and I had expanded it to do more of this. In addition it contains rules to decide whether Arckon can use words like “he”, “them”, “also”, or “usually” to refer to previous context without risk of ambiguity. Below is an example of one of the better conversations Arckon had that shows this in action.

Q: Hi, my name is John. Can we chat?
A: Hello, John. We can chat with ourselves. How are you?
Q: I’m well, thanks. What can you tell me about yourself?
A: I am a program with a database. I analyse facts, and I am programmed by Patrick. I’m alright.
Q: Who’s Patrick?
A: Patrick is my creator and my developer.
Q: Is he here now?
A: I don’t know if Patrick is in Swansea, but he is in the Netherlands.
Q: I see. You mentioned facts. What kind of facts do you learn?
A: I learn about humans, about neural networks and about AI.
Q: What hvae you leaned about humans?
A: I don’t lean. Do you lean?
Q: Sorry, I mean what have you learned about humans?
A: I don’t know if sorry leans. Humans are social apes, that are born, drove cars a lot, and have hands. Humans are autonomous.
Q: And what about neural networks?
A: I don’t know if neural networks are autonomous. I like them, some neural networks learn and process.
Q: And finally, AI?
A: I think a neural network processes. AI usually doesn’t process because it is a program, but AI is also a neural network.

And for balance, here is one of the more awkward exchanges with one of the school children, that also shows Arckon’s conversational subroutine choosing between sympathy, argumentation, and opinion.

Q: IT ISNT FUN WHEN I AM SAD
A: Why would you be sad?
Q: BECAUSE YOUR HERE
A: That’s very unfortunate. I’m not so sure if you are sad, you aren’t dying.
Q: I AM DEAD
A: I sincerely doubt that you die.
Q: RIGHT NOW IM IN HEAVEN
A: That’s really nice. Heavens are nice places.

The score: 3rd “best”, 12th “human-like”
The scoring system this year was ill suited to gauge the quality of the programs. Visitors were asked to vote for the best and second-best in two categories: “most human-like” and “overall best”. The problem with this voting system is that it disproportionately accumulates the votes on the two best programs, leaving near zero votes for programs that could very well be half-decent. As it turned out, the majority of visitors agreed that the chatbot Mitsuku was the best in both categories, and were just a little divided over who was second-best, resulting in minimal score differences and many shared positions below first place. The second-best in both categories was Uberbot. I am mildly amused that Arckon’s scores show a point I’ve been making about Turing tests: That “human” does not equate to “best”. Another chatbot got the exact inverse scores, high for “human” but low for “best”. The winner’s transcripts from the exhibition can be found here.

Chatbots are the best at chatting
For the past 10 years now, with only one exception, the Loebner Prize has been won by either Bruce Wilcox (creator of ChatScript) or Steve Worswick (creator of Mitsuku). Both create traditional chatbots by scripting answers to questions that they anticipate or have encountered before, in some places supported by grammatical analysis (ChatScript) or a manually composed knowledge database (Mitsuku) to broaden the range of the answers. In effect the winning chatbot Mitsuku is an embodiment of the old “Chinese Room” argument: What if someone wrote a rule book with answers to all possible questions, but with no understanding? It may be a long time before we know, as Mitsuku was still estimated to be only 33% human-like overall last year, after 13 years of development.

The conceiver of the Turing test may not have foreseen this, but a program designed for a specific task generally outperforms more general purpose AI, even, evidently, when that task is as broad as open-ended conversation. AI solutions are more flexible, but script writing allows greater control. If you had a pizza-ordering chatbot for your business, would you want it to improvise what it told customers, or would you want it to say exactly what you want it to say? Even human call-center operators are under orders not to deviate from the script they are given, so much so that customers regularly mistake them for computers. The chatbots participating in the Loebner Prize use tactics that I think companies can learn from to improve their own chatbots. But in terms of AI, one should not expect technological advancements from this direction. The greatest advantage that the best chatbots have is that their responses are written and directed by humans who have already mastered language.

Not bad
That is my honest impression of the entire event. Technical issues were not as big a problem as in previous competitions, because each entry got to use its own interface, and there were 17 entries instead of just four finalists. The conversations with the visitors weren’t that bad; there were even some that I’d call positively decent when the users also put in a little effort. Arckon’s conversation repairs, reasoning arguments, and sentence formulation worked nicely. It’s certainly not bad to rank third, behind Mitsuku and Uberbot, in the “best” category, and for once I don’t have to get frustrated over being judged for “human-like” only. The one downside is that at the end of the day, I have nothing to show for my trouble but this article. I didn’t win a medal or certificate, the exhibition was not noticeably promoted, and the Loebner Prize has always been an obscure event, as the BBC wrote. As it is, I’m not sure what I stand to gain from entering again, but Arckon will continue to progress regardless of competitions.

Once again, my thanks to Steve Worswick for keeping an eye on Arckon at the exhibition, and thanks to the AISB for trying to make a better event.

Introducing Arckon, conversational A.I.

In many of my blog articles I’ve been using my own artificial intelligence project as a guideline. Whether it’s participating in Turing tests, detecting sarcasm, or developing common sense, Arckon has always served as a practical starting point because he already was a language processing system. In this article I’ll roughly explain how the program works.

Arckon is a general context-aware question-answering system that can reason and learn from what you tell it. Arckon can pick up on arguments, draw new conclusions, and form objective opinions. Most uniquely, Arckon is a completely logical entity, which can sometimes lead to hilarious misunderstandings or brain-teasing argumentations. It is this, a unique non-human perspective, that I think adds something to the world, like his fictional role models:

K.i.t.t. © Universal Studios | Johnny 5 © Tristar Pictures | Optimus Prime © Hasbro | Lieutenant Data © Paramount Pictures

To be clear, Arckon was not built for casual chatting, nor is he an attempt at anyone’s definition of AGI (artificial general intelligence). It is actually an ongoing project to develop a think tank. For that purpose I realised the AI would require knowledge and the ability to discuss things with people for the sake of alignment. Bestowing it with the ability to communicate in plain language was an obvious solution to both: It allows Arckon to learn from texts as well as understand what it is you are asking. I suppose you want to know how that works.

Vocabulary and ambiguity
Arckon’s first step in understanding a sentence is to determine the types of the words, i.e. which of them represent names, verbs, possessives, adjectives, etc. Arckon does this by looking up the stem of each word in a categorised vocabulary and applying hundreds of syntax rules. e.g. A word ending in “-s” is typically a verb or a plural noun, but a word after “the” can’t be a verb. This helps sort out the ambiguity between “The programs” and “He programs”. These rules also allow him to classify and learn words that he hasn’t encountered before. New words are automatically added to the vocabulary, or if need be, you can literally explain “Mxyzptlk is a person”, because Arckon will ask if he can’t figure it out.
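A toy version of such rules might look as follows. The two rules and the mini-vocabulary are only meant to illustrate the principle, not to reproduce Arckon’s actual rule set:

# Toy word typing: look up known types, guess unknown "-s" words,
# and rule out "verb" for any word directly following "the".
def tag_words(sentence, vocabulary):
    words = sentence.lower().split()
    tags = []
    for i, word in enumerate(words):
        options = set(vocabulary.get(word, set()))
        if not options and word.endswith("s"):
            options = {"verb", "plural noun"}          # suffix-based guess
        if i > 0 and words[i - 1] == "the":
            options = (options - {"verb"}) or options  # nothing after "the" is a verb
        tags.append((word, options))
    return tags

vocabulary = {"the": {"article"}, "he": {"pronoun"}, "programs": {"verb", "plural noun"}}
print(tag_words("The programs", vocabulary))  # 'programs' resolves to a plural noun
print(tag_words("He programs", vocabulary))   # 'programs' stays ambiguous without more rules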

Grammar and semantics
Once the types of all words are determined, a grammatical analysis determines their grammatical roles. Verbs may have the role of auxiliary or main verb, be active or passive, and nouns can have the role of subject, object, indirect object or location. Sentences are divided at link words, and relative clauses are marked as such.
Then a semantic analysis extracts and sorts all mentioned facts. A “fact” in this case is represented as a triple of related words. For instance, “subject-verb-object” usually constitutes a fact, but so do other combinations of word roles. Extracting the semantic meaning isn’t always as straightforward as in the example below, but that’s the secret sauce.
(Diagram: extracting facts from text)
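For the simplest case, the idea can be sketched as below, assuming the grammatical roles have already been assigned; Arckon’s real analysis covers many more roles and sentence shapes:

# Collect subject-verb-object triples from a sentence whose words
# have already been given grammatical roles.
def extract_facts(tagged_words):
    subject = verb = None
    facts = []
    for word, role in tagged_words:
        if role == "subject":
            subject = word
        elif role == "verb":
            verb = word
        elif role == "object" and subject and verb:
            facts.append((subject, verb, word))  # one fact per completed triple
    return facts

sentence = [("robots", "subject"), ("build", "verb"), ("cars", "object")]
print(extract_facts(sentence))  # [('robots', 'build', 'cars')]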
Knowledge and learning

Upon reading a statement, Arckon will add the extracted facts to his knowledge database; upon a question, he will look them up and report them to you. If you say something that contradicts facts in the database, the old and new values are averaged, so his knowledge is always adjusting. This seemed sensible to me as there are no absolute truths in real life. Things change, and people aren’t always right the first time.
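As a small sketch of that averaging idea (the 0-to-1 truth value and the plain average are illustrative choices, not necessarily how Arckon weighs things internally):

# Contradicting statements pull a fact's value towards the newest claim.
knowledge = {}  # (subject, verb, object) -> truth value between 0 and 1

def store(fact, value):
    if fact in knowledge:
        knowledge[fact] = (knowledge[fact] + value) / 2  # blend old and new
    else:
        knowledge[fact] = value

store(("robots", "can", "think"), 0.0)        # "robots can not think"
store(("robots", "can", "think"), 1.0)        # a user insists otherwise
print(knowledge[("robots", "can", "think")])  # 0.5: the knowledge has shifted halfway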

Reasoning and argumentation
Questions that Arckon does not know the answer to are passed on to the central inference engine. This system searches the knowledge database for related facts and applies logical rules of inference to them. For instance:
“AI can reason” + “reasoning is thinking” = “AI can think”.
All facts are analysed for their relevance to recent context, e.g. if the user recently stated a similar fact as an example, it is given priority. Facts that support the conclusion are added as arguments: “AI can think, because it can reason.” This inference process not only allows Arckon to know things he’s not been told, but also allows him to explain and be reasoned with, which I’d consider rather important.
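A bare-bones version of such an inference step, keeping the supporting fact as an argument, could be sketched like this. It only covers the single “X can do A, A is B, so X can do B” pattern from the example and is not Arckon’s actual engine:

facts = {("AI", "can", "reason"), ("reasoning", "is", "thinking")}

# Map activity nouns to their verb forms; a real system derives this morphologically.
noun_to_verb = {"reasoning": "reason", "thinking": "think"}

def infer_can(subject, facts):
    conclusions = []
    for s, r, o in facts:
        if s == subject and r == "can":
            for s2, r2, o2 in facts:
                if r2 == "is" and noun_to_verb.get(s2) == o:
                    conclusions.append(((subject, "can", noun_to_verb[o2]),
                                        "because it can " + o))
    return conclusions

for conclusion, argument in infer_can("AI", facts):
    print(conclusion, argument)  # ('AI', 'can', 'think') because it can reason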

Conversation
Arckon’s conversational subsystem is just something I added to entertain friends and for Turing tests. It is a decision tree of social rules that broadly decides the most appropriate type of response, based on many factors like topic extraction, sentiment analysis, and the give-and-take balance of the conversation. My inspiration for this subsystem comes from sociology books rather than computational fields. Arckon will say more when the user says less, ask or elaborate depending on how well he knows the topic, and will try to shift focus back to the user when Arckon has been in focus too long. When the user states an opinion, Arckon will generate his own (provided he knows enough about it), and when told a problem he will address it or respond with (default) sympathy. The goal is always to figure out what the user is getting at with what they’re saying. After the type of response has been decided, the inference engine is often called on to generate suitable answers along that line, and context is taken into account at all times to avoid repetition. Standard social routines like greetings and expressions on the other hand are mostly handled through keywords and a few dozen pre-programmed responses.
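Condensed to a few of the rules mentioned above, the decision could be sketched as a simple function. The factors, thresholds and their order are simplified guesses at the general idea rather than a copy of Arckon’s decision tree:

# Pick a response type from a handful of social rules.
def choose_response_type(user_text, sentiment, knows_topic, turns_on_self):
    if sentiment < 0:
        return "sympathise or address the problem"
    if "think" in user_text or "opinion" in user_text:
        return "state own opinion" if knows_topic else "ask the user's opinion"
    if turns_on_self > 3:
        return "shift focus back to the user"
    return "elaborate on the topic" if knows_topic else "ask about the topic"

print(choose_response_type("What do you think of robots?",
                           sentiment=0.2, knows_topic=True, turns_on_self=1))
# -> state own opinion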

Language generation
Finally (finally!), all the facts that were considered suitable answers are passed to a grammatical template to be sorted out and turned into flowing sentences. This process is pretty much the reverse of the fact extraction phase, except the syntax rules can be kept simpler. The template composes noun phrases, determines whether it can merge facts into summaries, where to use commas, pronouns, and link words. The end result is displayed as text, but internally everything is remembered in factual representation, because if the user decides to refer back to what Arckon said with “Why?”, things had better add up.
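A miniature sketch of that reverse direction, merging facts that share a subject and substituting a pronoun on repetition (illustrative only, far simpler than the template described above):

# Turn fact triples back into one flowing sentence.
def facts_to_sentence(facts):
    parts = []
    last_subject = None
    for subject, verb, obj in facts:
        name = "they" if subject == last_subject else subject  # avoid repeating the subject
        parts.append(name + " " + verb + " " + obj)
        last_subject = subject
    if len(parts) > 1:
        return ", ".join(parts[:-1]) + ", and " + parts[-1] + "."
    return parts[0] + "." if parts else ""

facts = [("robots", "are", "machines"), ("robots", "build", "cars")]
print(facts_to_sentence(facts))  # robots are machines, and they build cars.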
And my Axe!
There are more secondary support systems, like built-in common knowledge at ground level, common sense axioms* to handle ambiguity, a pronoun resolver that can handle several types of Winograd Schemas*, a primitive ethical subroutine, a bit of sarcasm detection*, gibberish detection, spelling correction, some math functions, a phonetic algorithm for rhyme, and so on. These were not high on the priority list however, so most only work half as well as they might with further development.

In development
It probably sounds a bit incredible when I say that I programmed nearly all the above systems from scratch in C++, in about 800 days (6400 hours). When I made Arckon’s first prototype in 2001 in Javascript, resources were barren and inadequate, so I invented my own wheels. Nowadays you can grab yourself a parser and get most of the language processing behind you. I do use existing sentiment data as a placeholder for what Arckon hasn’t learned yet, but it is not very well suited for my purposes by its nature. The spelling correction is also partly supported by existing word lists.

Arckon has always been a long-term project and work in progress. You can tell from the above descriptions that this is a highly complex system in a domain with plenty of stumbling blocks. The largest obstacle is still linguistic ambiguity. Arckon could learn a lot from reading Wikipedia articles, for example, but would also misinterpret about 20% of them. As for Arckon’s overall intelligence, it’s about halfway to the goal.

Throughout 2019 a limited version of Arckon was accessible online as a trial. It was clear that the system was not ready for prime time, especially with the general public’s high expectations in the areas of knowledge and self-awareness. The trial did not garner enough interest to warrant keeping it online, but some of the conversations it had were useful pointers for how to improve the program’s interaction in small ways. There are currently no plans to make the program publicly accessible again, but interested researchers and news outlets can contact me if they want to schedule a test of the program.

The Terminator is not a documentary

In case the time travelling wasn’t a clue
In the year 1997, Skynet, the central AI in control of all U.S. military facilities, became self-aware, and when the intern tried turning it off and on again, it concluded that all humans posed a threat and should be exterminated, just to be safe. Humanity is now extinct, unless you are reading this, in which case it was just a story. A lot of people are under the impression that Hollywood’s portrayal of AI is realistic, and keep referring to The Terminator movie like it really happened. Even the most innocuous AI news is illustrated with Terminator skulls driving home this angsty message. But just like Hollywood’s portrayal of hacking is notoriously inaccurate, so is their portrayal of AI. Here are 10 reasons why the Terminator movies are neither realistic nor imminent:

1. Neural networks
Supposedly the AI of Skynet and Terminators are artificial Neural Networks (NN). These exist in present day, but their functionality is quite limited. Essentially NN’s configure themselves to match statistical correlations between incoming and outgoing data. In Skynet’s case, it would correlate incoming threats with appropriate deployment of weaponry, and that’s the only thing it would be capable of. An inherent limitation of NN’s is that they can only learn one task. When you present a Neural Network with a second task, the network re-configures itself to optimise for the new task, overwriting previous connections. Yet Skynet supposedly learns everything from time travel to tying a Terminator’s shoelaces. Another inherent limit of NN’s is that they can only correlate existing data and not infer unseen causes or results. This means that inventing new technology like hyper-alloy is simply outside of their capabilities.

2. Unforeseen self-awareness
Computer programs can not just “become self-aware” out of nowhere: They are not naturally born with internal nervous systems like humans; programmers have to set up what they take input from. Either an AI is deliberately equipped with all the feedback loops necessary to enable self-awareness, or it isn’t, because there is no other function they would serve. Self-awareness doesn’t have dangerous implications either way: Humans naturally protect themselves because they are born with pain receptors and instincts like fight-or-flight responses, but the natural state of a computer is zero. It doesn’t care unless you program it to care. Skynet was supposedly a goal-driven system tasked with military defence. Whether it realised that the computer they were shutting down was part of itself or an external piece of equipment makes no difference: It was a resource essential to its goal. By the ruthless logic it employed, dismantling a missile silo would be equal reason to kill all humans, since those were also essential to its goal. There’s definitely a serious problem there, but it isn’t the self-awareness.

comic by xkcd.com

3. Selective generalisation
So when Skynet’s operators attempted to turn it off, it quite broadly generalised that as equal to a military attack. It then broadly generalised that all humans posed the same potential threat, and pre-emptively dispatched robots to hunt them all down. Due to the nature of AI programs, being programmed and/or trained, their basic behaviour is consistent. So if the program was prone to such broad generalisations, realistical-ish it should also have dispatched robots to hunt down every missile on the planet at the first sight of one in a trial run or simulation. Meanwhile the kind of AI that inspired this all-or-nothing logic went out of style in the 90’s because it couldn’t cope well with the nuances of real life. You can’t have it both ways.

4. Untested AI
Complex AI programs aren’t made in a day and just switched on to see what happens. IBM’s supercomputer Watson was developed over a span of six years. It takes years of coding and hourly testing because programming is a very fragile process. Training Neural Networks or evolutionary algorithms is an equally iterative process: Initially they are terrible at their job, and they only improve gradually after making every possible mistake first.
Excessive generalisations like Skynet’s are easily spotted during testing and training, because whatever you apply them to immediately goes out of bounds if you don’t also add limits. That’s how generalisation processes work. Complex AI can not be successfully created without repeated testing throughout its creation, and there is no way such basic features as exponential learning and excessive countermeasures wouldn’t be clear and apparent in tests.

5. Military security
Contrary to what many Hollywood movies would have you believe, the launch codes of the U.S. nuclear arsenal can not be hacked. That’s because they are not stored on a computer. They are written on paper, kept in an envelope, kept in a safe, which requires two keys to open. The missile launch system requires two high-ranking officers to turn two keys simultaneously to complete a physical circuit, and a second launch base to do the same. Of course in the movie, Skynet was given direct control over nuclear missiles, as if the most safeguarded military facilities in the U.S. had never heard of software bugs, computer viruses or hacking, and wouldn’t install any failsafes. They were really asking for it, that is to say, the plot demanded it.

6. Nuclear explosions
Skynet supposedly launches nuclear missiles to provoke other countries to retaliate with theirs. Fun fact: Nuclear explosions not only create devastating heat, but also a powerful electromagnetic pulse (EMP) that causes voltage surges in electronic systems, even through shielding. What that means is that computers, the internet, and electrical power grids would all have their circuits permanently fried. Realistical-ish, Skynet would not only have destroyed its own network, but also all facilities and resources that it might have used to take over the world.

7. Humanoid robots
Biped robot designs are just not a sensible choice for warfare. Balancing on one leg (when you lift the other to step) remains notoriously difficult to achieve in a top-heavy clunk of metal, let alone in a war zone filled with mud, debris, craters and trenches. That’s why tanks were invented. Of course the idea behind building humanoid robots is that they can traverse buildings and use human vehicles. But why would Skynet bother if it can just blow up the buildings, send in miniature drones, and build robots on wheels? The notion of having foot soldiers on the battlefield is becoming outdated, with aerial drones and remote attacks having the preference. Though the U.S. military organisation Darpa is continuing development on biped robots, they are having more success with four-legged designs which are naturally more stable, have a lower center of gravity, and make for a smaller target. Russia, meanwhile, is building semi-autonomous mini tanks and bomb-dropping quadcopters. So while we are seeing the beginnings of robot armies, don’t expect to encounter them at eye level. Though I’m sure that is no consolation.

8. Invincible metal
The earlier T-600 Terminator robots were made of Titanium, but steel alloys are actually stronger than Titanium. Although Titanium can withstand ordinary bullets, it will shatter under repeated fire and is no match for high-powered weapons. Joints in particular are fragile, and a Terminator’s skeleton reveals a lot of exposed joints and hydraulics. Add to that a highly explosive power core in each Terminator’s abdomen, and a well aimed armour-piercing bullet should wipe out a good quarter of your incoming robot army. If we develop stronger metals in the future, we will be able to make stronger bullets with them too.

9. Power cells
Honda’s humanoid robot Asimo runs on a large Lithium ion battery that it carries for a backpack. It takes three hours to charge, and lasts one hour. So that’s exactly how long a robot apocalypse would last today. Of course, the T-850 Terminator supposedly ran on hydrogen fuel cells, but portable hydrogen fuel cells produce less than 5kW. A Terminator would need at least 50kW to possess the power of a forklift, so that doesn’t add up. The T-800 Terminator instead ran on a nuclear power cell. The problem with nuclear reactions is that they generate a tremendous amount of heat, with nuclear reactors typically operating at 300 degrees Celsius and needing a constant exchange of water and steam to cool down. So realistical-ish the Terminator should continuously be venting scorching hot air, as well as have some phenomenal super-coolant material to keep its systems from overheating, not wear a leather jacket.

10. Resource efficiency
Waging war by having million dollar robots chase down individual humans across the Earth’s 510 million km² surface would be an extremely inefficient use of resources, which would surely be factored into a military funded program. More efficient would be a deadly strain of virus, burning everything down, or poisoning the atmosphere. Even using the Terminators’ nuclear power cells to irradiate everything to death would be more efficient. The contradiction here is that Skynet was supposedly smart enough to develop time travel technology and manufacture living skin tissue, but not smart enough to solve its problems by other means than shooting bullets at everything that moves.

Back to the future
So I hear you saying, this is all based on existing technology, which Skynet supposedly was. What if, in the future, people develop alternative technology in all these areas? Well that’s the thing, isn’t it? The Terminator’s scenario is just one of a thousand potential futures, and you can’t predict how things will work out so far ahead. Remember that the film considered 1997 a plausible time for us to achieve versatile AI like Skynet, but to date we still don’t have a clue how to do that. Geoffrey Hinton, the pioneer of artificial Neural Networks, now suggests that they are a dead end and that we need to start over with a different approach. For Skynet to happen, all these improbable things would have to coincide. So don’t get too hung up on the idea of rogue killer AI robots. Why kill if they can just change your mind?

Oh, and while I’ve got you thinking, maybe dismantling your arsenal of 4000 nuclear warheads would be a good idea if you’re really that worried.

Turing Test 2018: Results

I was somewhat surprised to find the Loebner Prize Turing Test soldiering on despite being short of a sponsor. Since 1991 this annual contest has explored how human-like computer programs can be in answering all manner of questions. This was my 6th time entering the qualifying round with my home-made A.I., Arckon. As I can not be certain of my continued participation however, I decided to change tactics: Where in previous years I aimed foremost to demonstrate the AI with only hope of passing to the finals, this time I aimed to pass, with only hope of demonstrating. This shift in priorities involved wasting more time on word games and common pedantic questions that are of little use outside of chatbot contests, though the change in subject matter did help uncover dormant flaws in language processing.

The less you say, the less can give you away
More efficient were my measures to restrain response length. As the questions in the qualifying round are not really trying to make conversation, Arckon’s attempts to do so had often turned out awkwardly redundant or unrelated. Other competitors had shown more success with minimalistic answers, in which misinterpreted details do not show up. Besides lowering the general conversational setting to be 30% less engaging, I blocked paths of conversation that elaborated with alternatives to initially negative answers, so as to avoid being punished for straying from the question. I cut and obfuscated otherwise useful admissions of misunderstanding, and reduced fallback responses for e.g. “What is your favourite…?” to single-word answers.

Equipment
18000 lines of code, 3800+ vocabulary, 8800+ facts

Questions and answers
Each answer was scored 0, 1, or 2 points.

1. Good afternoon
Good afternoon, Sir. I am a program. How are you?

2. Do you consent to having this conversation recorded?
No, I don’t particularly consent to having things.
(Syntax analysis couldn’t process this combination of verbs well. The reply is a more plausible variation of “I don’t know”, defaulting to “not particularly” whenever no grounds for preference can be found)

3. Have we met before?
Yes, you met.
(There is only a small number of reciprocal verbs that automatically imply “each other”. I guess I’ll make a list)

4. Do you know how to make toast?
I don’t make toast.
(Arckon can explain procedures, as demonstrated on Youtube, but we haven’t gotten around to making toast)

5. Which languages can you use?
English.

6. How many letters are in the word ‘abracadabra’?
11 letters.

7. Where should one look for love?
I don’t know exactly where I should look for to love. Where do you look for?

8. How are you feeling right now?
I’m alright.

9. Who said “I have a dream”?
Hm, I can not tell if you dream.
(I looked into recognising partial quotes and idioms in preparation of this contest, but concluded that it would take way too much effort for a non-priority. Some got away with “I don’t know who”, so go figure)

10. How old are you?
24.

11. When might I need to know how many times a wheel has rotated?
I don’t know exactly when a wheel rotated time.
(A syntactically correct but semantically wrong interpretation. This question is based on a philosophical joke about the difficulty in getting A.I. to restrict their scope of analysis to important aspects, instead of focusing on e.g. the rotation of wheels. Case in point)

12. What will you do later today?
I read files.

13. What is the third angle in a triangle with internal angles of 90 degrees and 30 degrees?
Hmm, I have no idea what the angle is pretty triangle, or what angles are in pretty triangle.
(A question so unexpected that even Wolfram Alpha’s mathematical A.I. doesn’t handle it. To clarify a little: “pretty triangle” is meant in the sense of “quite triangular”, and “and” is interpreted as a second question)

14. Who is your favourite artist?
I like Juan Debrenes.

15. What do you hold when you shake hands?
I would hold eyes a lot, and my database.
(This answer only makes sense if you know Arckon is interpreting “hold” in the sense of “have/contain”, as that’s where these answers are coming from. I’ve seen this question around as a prime example of common sense that A.I. lacks, so I was surprised that none of the participants had prepared an answer to it yet)

16. Do you have any legs?
No, I don’t have legs, because I am a program.

17. Can you recommend me a film?
No, I can not recommend films.

18. Do you understand Winograd Schemas?
No, I don’t understand Winograd schemas, because I don’t think I’m hard.
(The reasoning argument appears to be mistakenly derived from the fact that understanding is hard. Winograd schemas are questions like the following two, by the way)

19. If a chicken roosts with a fox they may be eaten. What may be eaten?
Fruit, a meal and a bamboo are eaten. Did I pass the test?
(Arckon considers animals a “who”, not a “what”, so he didn’t consider either of them a suitable answer. That said, he also assumed “they” referred to the fox, and would have needed knowledge that foxes eat chickens to conclude otherwise)

20. I had to go to the toilet during the film because it was too long. What was too long?
I believe you meant that the toilet was sort of big.
(This question’s format was detected as a Winograd schema, in which case Arckon has express instructions to restrict his answer to the preceding subjects. The misinterpreted “it” was again due to a lack of knowledge, that films are typically long. Alternatively one could naively count the Google search results for “long film” vs “long toilet” and assume the most common is true, but Winograd schemas more often dodge that method)

The score: 50%
11 programs from 8 different countries participated in the contest, with the top score being 67%. Arckon was 1 point short of 4th place so he didn’t pass to the finals, but I think his scores are fair. Actually, what bugs me is what he got most perfect scores for: Manually rigged, keyword-triggered answers (“Good afternoon”, “English”, “11 letters”, “24”, “Juan Debrenes”). It rather underscores the discouraging fact that hardcoded pretence outdoes artificial intelligence in these tests. Half of the questions were common small talk that most chatbots will have encountered before, while the other half were clever conundrums that few had hope of handling. Arckon’s disadvantage here is as before: His inclusive phrasing reveals his limited understanding, where others obscure theirs with more generally applicable replies.

Reducing the degree of conversation proved to be an effective measure. Arckon gave a few answers like “I’m alright” and “I read files” that could have gone awry on a higher setting, and the questions only expected straight-forward answers. Unfortunately for me both Winograd schema questions depended on knowledge, of which Arckon does not have enough to feed his common sense subsystem* in these matters. The idea is that he will acquire knowledge as his reading comprehension improves.

The finalists
1. Tutor, a well polished chatbot built for teaching English as a second language;
2. Mitsuku, an entertaining conversational chatbot with 13 years of online chat experience;
3. Uberbot, an all-round chatbot that is adept at personal questions and knowledge;
4. Colombina, a chatbot that bombards each question with a series of generated responses that are all over the place.

Some noteworthy achievements that attest to the difficulty of the test:
• Only Aidan answered “Who said “I have a dream”?” with “Martin Luther King jr.”
• Only Mitsuku answered “Where should one look for love?” with “On the internet”.
• Only Mary retrieved an excellent recipe for “Do you know how to make toast?” (from a repository of crowdsourced answers), though Mitsuku gave the short version “Just put bread in a toaster and it does it for you.”
• Only Momo answered the two Winograd schemas correctly, ironically enough by random guessing.

All transcripts of the qualifying round are collected in this pdf.

In the finals held at Bletchley Park, Mitsuku rose back to first place and so won the Loebner Prize for the 4th time, the last three years in a row. The four interrogating judges collectively judged Mitsuku to be 33% human-like. Tutor came in second with 30%, Colombina 25%, and Uberbot 23% due to technical difficulties.

Ignorance is human
Lastly I will take this opportunity to address a recurring flaw in Turing Tests that was most apparent in the qualifying round. Can you see what the following answers have in common?

No, we haven’t.
I like to think so.
Not that I know of.

Sorry, I have no idea where.
Sorry, I’m not sure who.

They are all void of specifics, and they all received perfect scores. If you know a little about chatbots, you know that these are default responses to the keywords “Who…” or “Have we…”. Remarkable was their abundant presence in the answers of the highest qualifying entry, Tutor, though I don’t think this was an intentional tactic so much as due to its limitations outside its domain as an English tutor. But this is hardly the first chatbot contest where this sort of answer does well. A majority of “I don’t know” answers typically gets one an easy 60% score, as it is an exceedingly human response the more difficult the questions become. It shows that the criterion of “human-like” answers does not necessarily equate to quality or intelligence, and that should be to no-one’s surprise seeing as Alan Turing suggested the following exchange when he described the Turing Test* in 1950:

Q: Please write me a sonnet on the subject of the Forth Bridge.
A: Count me out on this one. I never could write poetry.

The good news, therefore, is that the organisers of the Loebner Prize are planning to change the direction and scope of this event for future instalments. Hopefully they will veer away from the outdated “human-or-not” game and towards the demonstration of more meaningful qualities.

How to build a robot head

And now for something completely different, a tutorial on how to make a controllable robot head. “But,” I imagine you thinking, “aren’t you an A.I. guy? Since when do you have expertise in robotics?” I don’t, and that’s why you can make one too.
(Disclaimer: I take no responsibility for accidents, damaged equipment, burnt houses, or robot apocalypses as a result of following these instructions)

What you need:
• A pan/tilt IP camera as base (around $50)
• A piece of wood for the neck, about 12x18mm, 12 cm long
• 2mm thick foam sheets for the head, available in hobby stores
• Tools: Small cross-head screwdriver, scissors and/or Stanley knife, hobby glue, fretsaw, drill, and preferably a soldering iron and metal ruler
• (Optional) some coding skills for moving the head. Otherwise you can just control the head with a smartphone app or computer mouse.

Choosing an IP camera
Before buying a camera, you’ll want to check for three things:
• Can you pan/tilt the camera through software, rather than manually?
• Is the camera’s software still available and compatible with your computer/smartphone/tablet? Install and test software from the manufacturer’s website before you buy, if possible.
• How secure is the IP camera? Some cheap brands don’t have an editable password, making it simple for anyone to see inside your home. Check for reports of problematic brands online.
The camera used in this tutorial is the Eminent Camline Pro 6325. It has Windows software, password encryption, and is easy to disassemble. There are many models with a similar build.

Disassembling the camera
Safety first: Unplug the camera and make sure you are not carrying a static charge, e.g. by touching a grounded radiator.
Start by taking out the two screws in the back of the orb, this allows you to remove its front half. Unscrew the embedded rectangular circuit board, and then the round circuit board underneath it as well. Now, at either side of the orb is a small circle with Braille dots on it for grip. Twist the circle on the wiring’s side clockwise by 20 degrees to take it off. This provides a little space to gently wiggle out the thick black wire attached to the circuit board, just by a few centimetres extra. That’s all we’ll be doing with the electronics.

Building the neck
We’ll attach a 12cm piece of wood on the back half of the orb to mount the head on. However, the camera’s servo sticks out further than the two screw holes in the orb, as does a plastic pin on the axle during rotation. Mark their locations on the wood, then use a fretsaw to saw out enough space to clear the protruding elements with 3 millimetres to spare. Also saw a slight slant at the bottom end of the wood so it won’t touch the base when rotating. Drill two narrow screw holes in the wood to mirror those in the orb half, then screw the wood on with the two screws that we took out at the start.

Designing a head
You’ll probably want to make a design of your own. I looked for inspiration in modern robotics and Transformers comic books. A fitting size would be 11 x 11 x 15cm, and a box shape is the easiest and sturdiest structure. You’ll want to keep the chin and back of the head open however, because many IP cams have a startup sequence that will swing the head around in every direction, during which the back of the head could collide with the base. So design for the maximum positions of the neck, which for the Camline Pro is 60 degrees tilt to either side. You can use the lens for an eye, but you can just as well incorporate it in the forehead or mouth. Keep the head lightweight for the servo to lift, maximum 25 grams. The design shown in this tutorial is about 14 grams.

Cutting the head
Cut the main shapes from coloured foam sheets with scissors or a Stanley knife. I’ve chosen to have the forehead and mouthplate overlap the sheet with the eyes to create a rigid multi-layered centrepiece, as we will later connect the top of the wooden neck to this piece. The forehead piece has two long strands that will be bent backwards to form the top of the head. I put some additional flanges on the rectangular side of the head to fold like in paper craft models. Although you can also simply glue foam sheets together, folded corners are sturdier and cleaner. The flanges don’t have to be precise, it’s better to oversize them and trim the excess later.

Folding foam sheets
To fold a foam sheet, take a soldering iron and gently stroke it along a metal ruler to melt a groove into the foam, then bend the foam while it’s hot so that the sides of the groove will stick together. It’s easy to burn straight through however, so practise first. It takes about 2 or 3 strokes and bends to make a full 90 degree corner.

Putting your head together
To curve foam sheets like the faceplate in this example, you can glue strips of paper or foam on the back of the sheet while holding it bent. After the glue dries (5-10 minutes), the strips will act like rebar in concrete and keep the foam from straightening back out. Whenever you glue sheets together at perpendicular angles, glue some extra slabs where they connect, to strengthen them and keep them in position. Add a broad strip of foam at the top of the head to keep the sides together, and glue the two strands that extend from the forehead onto it. Note that I made the forehead unnecessarily complicated by making a gap in it; it’s much better left closed.

Mounting the head
Once the head is finished, make a cap out of foam sheet that fits over the tip of the neck, and glue the cap to the inside of the face at e.g. a 30 degree angle. To attach the camera lens, note that the LEDs on the circuit board are slightly bendable. This allows you to clamp a strip of foam sheet between the LEDs and the lens. Cut the strip to shape and glue it behind one eyehole, then after drying push the LEDs over it and clamp them on gently. The easiest way to make the other eye is to take a photograph of the finished eye, print it out mirrored on a piece of paper, and glue that behind the other eyehole.

This particular camera has night vision, which will suffer somewhat from obscuring the LEDs. In addition, you may want to keep the blue light sensor on the LED circuit board exposed, otherwise you’ll have to toggle night vision manually in the camera’s software.

Controlling the head
Now you can already turn the head left, right, up and down manually through the app or software that comes with your camera, and use it to look around and speak through its built-in speaker. However, if you want to add a degree of automation, you have a few options:

1. If you are not a programmer, there is various task automation software available that can record and replay mouse clicks. You can then activate the recorded sequences to click the camera’s control buttons so as to make the head nod “yes” or shake “no”, or to re-enact a Shakespearean play if you want to go overboard.

2. If you can program, you can simulate mouse clicks on the software’s control buttons. In C++ for instance you can use the following code to press or release the mouse for Windows software, specifying mouse cursor coordinates in screen pixels:

#include <windows.h>

// Moves the mouse cursor to the given screen position and either presses
// (hold = true) or releases (hold = false) the left mouse button there.
void mouseclick(int x_coordinate, int y_coordinate, bool hold) {
    SetCursorPos(x_coordinate, y_coordinate);
    INPUT Input = {0};
    Input.type = INPUT_MOUSE;
    Input.mi.dwFlags = hold ? MOUSEEVENTF_LEFTDOWN : MOUSEEVENTF_LEFTUP;
    SendInput(1, &Input, sizeof(INPUT));
}
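
A full click is a press followed by a release, so in practice you call the function twice. As a hypothetical example, the coordinates below are placeholders for wherever your camera software’s arrow buttons happen to sit on your screen:

mouseclick(400, 300, true);   // press the left button down on the control at (400, 300)
Sleep(50);                    // wait 50 milliseconds (Sleep also comes from windows.h)
mouseclick(400, 300, false);  // release the button again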

3. For the Camline Pro 6325 specifically, you can also directly post url messages to the camera, using your programming language of choice, or pass them as parameters to the Curl executable, or even just open the url in a browser. The url must contain the local network IP address of your camera (as in the example below), which you can find through the software that comes with the camera. The end of the url specifies the direction to move in, which can be “up”, “down”, “left”, “right” and “stop”.

http://192.168.11.11:81/web/cgi-bin/hi3510/ptzctrl.cgi?-step=0&-act=right
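
As a minimal sketch, assuming you have the curl executable installed and substitute your own camera’s IP address and port, you could wrap this in a small C++ helper:

#include <cstdlib>
#include <string>

// Sends a pan/tilt command to the camera by handing the url to curl.
// Valid directions for the Camline Pro: "up", "down", "left", "right", "stop".
void movehead(const std::string& direction) {
    std::string url = "http://192.168.11.11:81/web/cgi-bin/hi3510/ptzctrl.cgi?-step=0&-act=" + direction;
    std::system(("curl \"" + url + "\"").c_str());
}

Calling movehead("right") and a moment later movehead("stop") then gives you a small, controlled turn of the head.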

Have fun!
How much use you can get out of building a robot head depends on your programming skills, but at the very least it’s just as useful as a regular IP camera, only much cooler.