Part IX: Hardware vs. Software

Posted in assistant, taxonomy, data, add-on on AI/ML

[ TBD ]

MCP advertises itself as USB for software
JTAG as another example. Also, let’s talk about how USB and lsusb work. There’s a standard driver inside the system, but if you want to specify custom behavior, you need to write your own drivers and hide the details and definitions.
Focus on Hardware and how devices can do more for privacy and cost.
Cartridge devices
Self-defining peripherals
MCP running on a device
A2A routes request to our devices
Ultra-smart plugin architecture
An extension that works across software, firmware, hardware, cloud, mobile, etc.
Layered architecture:

Physical
Low-level Logging
Regulated Firmware
Unregulated Firmware
System Software
Communication
Attestation
Automation / Agent
Authorization
Roles
Authentication
Proxy User (i.e. caregiver / parent / IT)
User
High-level Logging
Events | Actions
Ability Groups
UI (different types including script)
Scripting
Simulation

You may think AI Assistants are only made of software. But that is just one manifestation. Assistants can come in the shape of a standalone device, like an Amazon Echo, Google Nest Hub, or Apple HomePod, with the software running inside the device. We call that software firmware. An Assistant can also run inside a vehicle, on a television set, or on any device with enough capacity and the right input and output peripherals (like wifi, a microphone, and a speaker). There have been efforts to place them inside lapel pins and a small hand-held box. No doubt, there will be other form-factors.

In this section, let’s look at the different ways an Assistant can be packaged, and how we can create an idealized, ubiquitous mesh of coverage that stays with us from place to place while maintaining privacy and context.

I will also present a hardware architecture that is both extensible and adaptive to change. If there’s one thing we can be certain of, it’s that technology advances and improves. We can’t expect these devices to last forever. The best way to build confidence is to make them painlessly updatable, down to the physical level.

Software Assistant

What I’m calling a Software Assistant runs on a general-purpose computing device like a laptop, desktop, server, mobile phone, or smart watch. It’s just another application. You can interact with it using whatever peripherals are typically available on that device:

Laptop: trackpad, keyboard, display, microphone, camera, wifi, bluetooth.
Desktop: mouse, keyboard, display, wifi, bluetooth.
Server: networking.
Tablet: touch-screen, microphone, camera, lidar, wifi, bluetooth, cell, nfc, gps, motion-sensor.
Phone: touch-screen, microphone, camera, lidar, wifi, bluetooth, cell, nfc, gps, motion-sensor.
SmartWatch: touch-screen, microphone, speaker, wifi, bluetooth, cell, nfc, gps, motion-sensor.

Of course, you can augment these via wired and wireless peripherals, but I hope you get my point: there are different mechanisms available, depending on the device.
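To make the point concrete, here is a minimal sketch of that idea as a lookup table. The dictionary just mirrors a few of the lists above; `PERIPHERALS`, `supports`, and `devices_with` are names I made up for illustration, not part of any real assistant API.

```python
# Hypothetical capability table, copied from a few of the device lists above.
PERIPHERALS = {
    "laptop": {"trackpad", "keyboard", "display", "microphone", "camera",
               "wifi", "bluetooth"},
    "desktop": {"mouse", "keyboard", "display", "wifi", "bluetooth"},
    "server": {"networking"},
    "smartwatch": {"touch-screen", "microphone", "speaker", "wifi",
                   "bluetooth", "cell", "nfc", "gps", "motion-sensor"},
}


def supports(device: str, *needed: str) -> bool:
    """Return True if the device class has every peripheral in `needed`."""
    return set(needed) <= PERIPHERALS.get(device, set())


def devices_with(*needed: str) -> list:
    """List device classes (alphabetically) that have all needed peripherals."""
    return sorted(d for d in PERIPHERALS if supports(d, *needed))
```

With this table, `devices_with("microphone")` returns `["laptop", "smartwatch"]`, so a voice-first feature would know to skip the desktop unless a microphone has been added as a peripheral.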

When it comes to AI inference, each of these devices also has to contend with variations in storage, memory, GPU, NPU, TPU, and power. Something powered from the wall-socket can handle heavier, more power-hungry loads. A hand-held or wearable device has to worry about how much heat it generates and how quickly the battery runs out. An Assistant that is out of juice halfway through the day is not of much use in the afternoon and evening, and forces the user into unnatural usage patterns and uncomfortable choices.
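A rough sketch of how a device might act on that power budget follows. Everything here is an assumption for illustration: the `PowerState` fields, the 30% threshold, and the idea that offloading to a remote engine is the cheaper option when the battery is low (radios cost power too, so a real policy would need measurements).

```python
from dataclasses import dataclass


@dataclass
class PowerState:
    on_wall_power: bool
    battery_pct: float   # 0-100; ignored when on wall power
    has_network: bool


def choose_inference_target(state: PowerState,
                            low_battery_pct: float = 30.0) -> str:
    """Pick where to run a request: 'local', 'remote', or 'defer'."""
    if state.on_wall_power:
        return "local"   # no battery budget to protect
    if state.battery_pct > low_battery_pct:
        return "local"   # enough charge left to run on-device
    # Battery is low: offload if we can, otherwise postpone non-urgent work.
    return "remote" if state.has_network else "defer"
```

A watch at 10% battery with a network connection would get `"remote"`; the same watch with no network would get `"defer"`.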

Hardware Assistant

A Hardware Assistant is usually a standalone device with its own processing power, storage, and memory. Depending on the category and price, it may have additional peripherals. I’ll sort these into the following groups:

Wearable (e.g. the Humane AI Pin): touch-screen, microphone, speaker, wifi.
Handheld (e.g. the Rabbit R1 handheld): display, microphone, speaker, wifi.
Embedded Static (e.g. home appliances with built-in Bixby, Echo, and Google Assistant): touch-screen, wifi, bluetooth.
Embedded Moving (e.g. in-vehicle assistants): touch-screen, microphone, speaker, wifi, bluetooth, gps.
Small (e.g. Echo Dot, Google Nest Mini, HomePod): microphone, speaker, buttons, indicator lights, wifi, bluetooth, nfc.
Medium (e.g. Echo Show, Nest Hub): touch-screen, microphone, speaker, camera, buttons, indicator lights, wifi, bluetooth.

These don’t exist yet, but they could:

Mini: onboard a limited resource device, like a smart lightbulb, wall-switch, or network router.
Large (on-device AI processing, like a souped-up Echo Show): touch-screen, microphone, speaker, camera, lidar, infra-red, mmwave, buttons, indicator lights, wifi, bluetooth, nfc.
X-Large (lots of on-device horsepower, possibly sitting in a closet): ethernet, wifi, bluetooth, buttons, indicator lights.

There are various other permutations, like wall-socket pods (looking like a wifi range-extender), large/moving versions (running on the on-board driver-assist computers of cars, airplanes, trucks, and boats), and group-oriented devices (schools, nursing homes, hospitals, offices).

Commonalities

What could tie all these form-factors together?

It has to be a flexible architecture that allows an assistant to adapt itself to whatever mode of input, interaction, processing, and output is available on a specific device.
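As a sketch of what "adapting to whatever output is available" could look like, the snippet below picks a delivery mode from a preference list. The preference order, the mode names, and the `pick_output` and `render` helpers are all hypothetical; a real assistant would negotiate far more than this.

```python
from typing import Iterable, Optional

# Hypothetical preference order, richest output first.
OUTPUT_PREFERENCE = ("touch-screen", "display", "speaker", "indicator lights")


def pick_output(available: Iterable[str],
                preference: Iterable[str] = OUTPUT_PREFERENCE) -> Optional[str]:
    """Return the first preferred output mode the device has, or None."""
    have = set(available)
    for mode in preference:
        if mode in have:
            return mode
    return None


def render(reply: str, available: Iterable[str]) -> str:
    """Describe how a reply would be delivered on this device."""
    mode = pick_output(available)
    if mode in ("touch-screen", "display"):
        return f"show: {reply}"
    if mode == "speaker":
        return f"speak: {reply}"
    if mode == "indicator lights":
        return "blink"   # not enough bandwidth for the text itself
    return "queue for another device"
```

On a speaker-only device in the Small group, `render("Timer done", {"microphone", "speaker", "buttons", "indicator lights"})` gives `"speak: Timer done"`. The same reply on a device with no usable output gets queued for a better-equipped neighbor.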

It also has to work whether it is designed to run on a single device, as a client-server pair, or as a mesh. The functionality can move across devices fluidly as the user navigates the physical world, going between rooms, visiting neighbors and relatives, walking, driving, catching a bus or ferry, going to a doctor’s office, a school, an office, a bar or restaurant, a sporting arena, concert, or stadium, or heading out for a run, hike, or bike ride.

We want something that can be with us whenever WE need it.

We also want something that can notify us when there’s something we should know about, without overwhelming and annoying us. This one alone has had scant attention paid to it in the AI Assistant world, other than timers or annoying commerce-driven ads and notifications.

Hardware Acceleration and On-Device Computing

There are two races going on right now. The first is to throw as much muscle onto server-level, GPU-class devices, as evidenced by the efforts of GPU-makers like NVidia. The other is to beef up the local processing capabilities and run as much on-device as possible.

The second version has several distinct advantages: it saves on networking, power, and cost. Plus, the whole issue of privacy. By distributing load to the edge, the need for ever beefier cloud-based inference engines diminishes.

Examples:

Integrated:

Apple NPU
Microsoft Windows model
Qualcomm
Samsung

Co-processors:

TPU boards
Other?

Hardware also offers unique capabilities in optimizing the lowest levels of inference. However, there is one downside. On-device inference chips go out of date even faster than an opened bag of potato chips. Baking one into a device for its lifetime dooms the user to be behind the curve the minute they purchase it.

One solution is to embed some of the inference functionality in firmware, which can be field-updated. Another is to offer a hardware trade or recycling service, but that is costly and unwieldy.

The natural solution is… wait, let’s revisit a previous concept first.

Software Extensions

It is not feasible to create a single piece of software that can handle all these possibilities. Instead, we need a core, main app, with the ability to add (and remove) functionality as needed. We want an extensible, trusted system that can maintain our privacy and adapt when circumstances change (the cell network drops, we move from one place to another, something runs low on power, or a device malfunctions).
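The core-plus-extensions idea can be sketched in a few lines. This is only an illustration under my own assumptions: `AssistantCore`, its method names, and the notion of routing by an "intent" string are hypothetical, and it leaves out everything that makes the real problem hard (trust, sandboxing, payment).

```python
from typing import Callable, Dict


class AssistantCore:
    """Tiny core that only knows how to route intents to extensions."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[str], str]] = {}

    def add_extension(self, intent: str, handler: Callable[[str], str]) -> None:
        """Install (or replace) the extension that handles an intent."""
        self._handlers[intent] = handler

    def remove_extension(self, intent: str) -> None:
        """Uninstall an extension; removing a missing one is a no-op."""
        self._handlers.pop(intent, None)

    def handle(self, intent: str, text: str) -> str:
        """Dispatch to the installed extension, or report that none exists."""
        handler = self._handlers.get(intent)
        if handler is None:
            return f"No extension installed for '{intent}'"
        return handler(text)


core = AssistantCore()
core.add_extension("shout", lambda text: text.upper())
```

Here `core.handle("shout", "hi")` returns `"HI"`, and after `core.remove_extension("shout")` the same call falls back to the "no extension" message. The core itself never changes.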

This means we need a sort of turbo-charged extension mechanism. Something that can pay for itself without forcing the vendor into an untenable one-price-fits-all model and inevitable bankruptcy. Or into a model where user data is collected for no purpose other than monetizing it to cover expenses.

We need an Ultra Smart Plugin architecture.

The day of Monolithic, Static Software is over. Any software system that doesn’t allow flexible adaptiveness is doomed to niche status, balkanization, and Proprietary Hell. Rightfully so.

Extensions and upgrades should also be so simple to generate, test, deploy, and update that they become everyday operations.

Transparent, reliable, and trusted updates once a day. That’s the goal.

Hardware Extensions

In the mid-1970s, the Altair 8800, one of the first personal computers, introduced the expansion bus that became known as the S-100 bus. This model of providing a common wiring interface allowed computers to be modularized and shipped more quickly. It also gave birth to the concept of third-party hardware suppliers.

The whole thing blew wide open when the IBM PC was launched with an

Many successful companies, notably NVidia with graphics cards, got their start building plug-in boards for more popular computers. There were teething problems when users installed boards themselves, like misaligned pins, drivers that had to be downloaded and manually installed, and dirt and debris getting inside the enclosure. There have been various approaches to solving these problems.

But the era of monolithic cell-phones and one-piece AI Assistants put a big halt to all that. Maybe it’s time to rethink that approach.

A hardware device is a physical manifestation of a checkpoint in time. That was when product managers, engineers, manufacturers, marketers, and financiers decided those sets of features were good enough to ship.

It’s a physical manifestation of a Monolithic, Static Software package, etched in stone. Yes, you can add wired or wireless peripherals, but the trunk of the system can not be upgraded unless you throw it out and get something else.

What we need is the same concept as the software, but in hardware. Extensible functionality.

Examples:

Modu
Project Ara
Fairphone (not the same level)

Over-engineering. Just a single block that contains the main processor. Anything else (like screen, built-in sensors, etc.) comes later or gets attached as a peripheral.

The purpose of such a form-factor is to give users comfort that their choice doesn’t lock them in. They can upgrade the device later if they change their mind or hardware advances. Their device can be extended through outside peripherals, or internally via upgrade cartridges.
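One way to picture upgrade cartridges is as self-describing modules that announce what they add to a base unit. The sketch below is purely hypothetical: `Cartridge`, `BaseUnit`, and the capability names are invented here to illustrate the idea, not a description of any shipping hardware.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cartridge:
    name: str
    provides: frozenset   # capabilities this cartridge adds, e.g. {"cell"}


@dataclass
class BaseUnit:
    builtin: frozenset                          # what the main block ships with
    slots: dict = field(default_factory=dict)   # cartridge name -> Cartridge

    def attach(self, cart: Cartridge) -> None:
        self.slots[cart.name] = cart

    def detach(self, name: str):
        """Remove a cartridge and return it (None if it was not attached)."""
        return self.slots.pop(name, None)

    def capabilities(self) -> set:
        """Built-in capabilities plus whatever the attached cartridges add."""
        caps = set(self.builtin)
        for cart in self.slots.values():
            caps |= cart.provides
        return caps


unit = BaseUnit(builtin=frozenset({"wifi"}))
unit.attach(Cartridge("lte", frozenset({"cell", "gps"})))
```

After attaching the made-up `lte` cartridge, `unit.capabilities()` is `{"wifi", "cell", "gps"}`. Detach it and the unit drops back to `{"wifi"}`, which is the no-lock-in property the form-factor is after.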

How these changes are manifested depends on the business-model and the payment structure.

Payment Model

ℹ️ Side Note

This should be a section on its own. But I’m not ready to concede that all this should be a book-sized writeup.

Be quiet. I can’t hear you. La la la.

In the beginning, hardware was purchased for a fixed price. Built into this price was the cost of:

Engineering
Prototyping
Testing
Intellectual Property
Regulatory licenses
Parts
Manufacturing and Assembly
Packaging
Transportation
Marketing
Sales
Customer Support
Warranty/Repair

The actual price is based on a forecast (aka guess) of the level of demand, padded with a hefty margin of error.

When connected devices came around, you also had to add the cost of:

Networking hardware
Additional firmware
Physical devices (reset, connect, etc)
Optional: display and buttons to navigate
Mobile apps
Servers
Storage
API services
Web interface
Ongoing firmware development
Firmware update deployment
Additional customer support

Some of these are optional, but what they did was to add to the manufacturer’s ongoing expense to operate the system for the duration of its life. This means that the metric now must include an estimated lifetime for a device and the cost of keeping the services running for that duration.
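The arithmetic behind that metric is simple enough to write down. This is a back-of-the-envelope sketch: the function names are mine, and it assumes a flat per-device monthly service cost, which real back-ends rarely have.

```python
def lifetime_cost(unit_cost: float, monthly_service_cost: float,
                  lifetime_months: int) -> float:
    """Cost to build one device and keep its services running for its life."""
    return unit_cost + monthly_service_cost * lifetime_months


def folded_price(unit_cost: float, monthly_service_cost: float,
                 lifetime_months: int, margin: float = 0.0) -> float:
    """Single up-front price that covers the lifetime cost plus a margin."""
    total = lifetime_cost(unit_cost, monthly_service_cost, lifetime_months)
    return total * (1.0 + margin)
```

With made-up numbers, a device that costs 50 to build and 1 per month to serve over 48 months has a lifetime cost of 98. Folding that into one price with a 50% margin gives 147, nearly triple the build cost.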

Here you have several choices:

Fold the future cost into a single purchasing price
Upgrade services (i.e. pay to get extra functionality, e.g. Tesla FSD)
Subscriptions (pay as-you-go for the lifetime of the device or service)

The first one would make the up-front cost high, to mitigate future costs. But it could put the purchase out of reach and limit sales.

The second means you are shipping hardware that has all the functionality needed, but you unlock features after the sale. Problem is, the manufacturer is bearing the cost of hardware that may not be used (unlocked) by all users. Most Model-3 and Y Teslas shipped with beefy processing hardware, but only a fraction of users paid the extra fee for unlocking Full Self Driving mode. That’s a lot of cost to eat up-front.
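To see why that up-front cost hurts, consider what each paying customer has to cover. The helper below is a hypothetical sketch with invented numbers; it is not based on any actual Tesla figures.

```python
def unlock_fee_to_break_even(extra_hw_cost: float, take_rate: float) -> float:
    """Fee each paying user must cover so the dormant hardware in every
    shipped unit is paid for.

    extra_hw_cost: per-unit cost of the hardware that ships locked.
    take_rate: fraction of buyers who pay to unlock it (0 < take_rate <= 1).
    """
    if not 0 < take_rate <= 1:
        raise ValueError("take_rate must be in (0, 1]")
    return extra_hw_cost / take_rate
```

If the locked hardware costs 100 per unit and only a quarter of buyers unlock it, `unlock_fee_to_break_even(100.0, 0.25)` comes out to 400: each paying user is covering three other people’s unused silicon.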

The third one is the one you see most often today. It’s a hard sell, but it makes the most sense for manufacturers. Whether a service can keep running at a given scale depends on the ongoing revenue it brings in (plus, maybe, a little profit).

An expandable hardware architecture allows a manufacturer to ship a base device priced just above cost. They would make a small profit on each unit sold, which helps defray the cost of a short trial period. You could add connectivity through an additional for-pay module (to cover the design and production cost), plus an ongoing service subscription to help keep the back-end running.

Additional services may require the same model: purchase a plugin hardware component, add a subscription. If a new service comes around that requires extra processing or a new kind of network or peripheral, the modular cartridge model could allow current users to upgrade and provide additional revenue, without them having to incur the cost of buying an entirely new system.

Hardware/Service hybrid upgrades could become as easy as a software update: new hardware released on a monthly cycle.

ℹ️ Side Note

If you are a hardware person, I realize how crazy this all sounds.

These are aspirational goals, meant to make the process of hardware updates a boring, normal process instead of an annual extravaganza of high-stakes, adrenaline pumping, hair-on-fire affairs.

The software people have gone through this. Annual, high-stakes, do-or-die updates. It is nerve-wracking. Companies that normalized the update process with zero drama have realized the value and increased customer satisfaction.

Creating this process requires discipline and foresight. It requires automated testing, Continuous Integration/Continuous Deployment (CI/CD) systems, and product management that plans for small, incremental updates instead of one-shot massive ones.

Staffing

When building complex, connected hardware systems, you have to coordinate between hardware, firmware, networking, cloud services, back-end applications, APIs, web sites, mobile apps, UI/UX, all manner of protocols, QA at every level, and security roles.

Staffing questions when it comes to building connected hardware include:

Who to hire
What skills are needed
Coordinating design
Protocols
User-Interface designs
User-Experience flows
Internationalization
Design for Manufacturing (DFM)
Testing
Certification
Workflows
Customer Support
Maintenance
Sales and Post-sales engineering

Once the product is shipped, you will also need:

Back-end Ops
Maintenance and monitoring
Technical Support
Upgrade project management, development, testing, and deployment.

Keeping a bird’s-eye view is a technical architect or product manager who coordinates all the parts, heads off deadlocks and misunderstandings, and keeps the process humming along.

If you get lucky, a lot of these roles are covered by a single individual, but for a large, high-volume product, the sheer volume of what needs to be done is often beyond the capacity of individuals or small teams.

Fortunately, you don’t need all these positions staffed at all times. A good Project Manager will have broken the work down into phases, and set up who needs to be ready to pick up where and drop off when.

Protip: avoid the Agile disease when it comes to hardware and firmware. Short sprint cadences don’t fit the timeline and complexity of a process that requires sending out for prototypes, circuit boards, and certifications that take weeks and sometimes months.

Gridlock

Another key issue to consider is that hardware, firmware, back-end, web, and mobile teams often get into a circular gridlock.

Firmware needs hardware to build against, back-end to specify protocols and APIs, and mobile to offer test apps and wireless protocols.

Hardware needs mobile to specify user-interactions, wireless networking, and what to do when things don’t work out. If the hardware has display, cameras, or buttons, it needs to coordinate and sync the state with mobile, web, or back-end.

Back-end needs to coordinate with web and mobile on which APIs to provide.

And so on and so forth.

The solution is to decouple dependencies. Simulate the back-end APIs. Mock hardware using throwaway boards from hobbyist/DIY sites. If you need a custom processor or chip, try making do with an older or alternate model, but leave stubs and configuration settings so you can easily swap the real parts in when they show up.
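In code, the stub-and-swap approach might look something like this. It is a minimal sketch, assuming a made-up temperature sensor as the missing part; `TemperatureSensor`, `MockSensor`, `RealSensor`, and the `"sensor"` config key are all hypothetical names.

```python
from typing import Protocol


class TemperatureSensor(Protocol):
    """The interface every team codes against."""
    def read_celsius(self) -> float: ...


class MockSensor:
    """Stand-in used until the real board arrives."""
    def __init__(self, value: float = 21.5) -> None:
        self.value = value

    def read_celsius(self) -> float:
        return self.value


class RealSensor:
    """Stub for the production part; fill in once the hardware shows up."""
    def read_celsius(self) -> float:
        raise NotImplementedError("hardware not available yet")


SENSORS = {"mock": MockSensor, "real": RealSensor}


def make_sensor(config: dict) -> TemperatureSensor:
    """Pick the implementation from a config setting; defaults to the mock."""
    return SENSORS[config.get("sensor", "mock")]()


def needs_cooling(sensor: TemperatureSensor, limit: float = 30.0) -> bool:
    """Firmware logic that can be written and tested before hardware exists."""
    return sensor.read_celsius() > limit
```

The firmware team can test `needs_cooling(MockSensor(35.0))` today. When the real part lands, the only change is flipping the config to `{"sensor": "real"}` and filling in the stub.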

Decoupling allows teams to continue working without having to wait on another team’s output.

Leave them coupled, and that way lie dragons.

Analytics

Hardware and software analytics should serve a function. Privacy-conscious users are aware of devices and apps phoning home. There are governmental privacy rules, like the European GDPR and California’s CCPA in the U.S., that define what can be collected, what can be stored, and for what purpose.

Consider what the purpose of collecting that information should be. Remember that you can always upgrade an individual device with firmware that collects telemetry, then turn it back off with a simple firmware update. Same with mobile or web software.

I’ve been in meetings where large groups or highly experienced individuals could not answer why data was being collected and what business value it offered.

Everyone Does It is not a good enough reason.
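One way to force that conversation is to make the code refuse to collect anything without a stated purpose. The sketch below is hypothetical (`TelemetryPolicy` and its methods are my own invention), and it is no substitute for actual GDPR or CCPA compliance work.

```python
class TelemetryPolicy:
    """Only lets through fields that have a declared purpose."""

    def __init__(self) -> None:
        self._purposes = {}
        self.enabled = False   # off by default; flipped on by an update

    def declare(self, field_name: str, purpose: str) -> None:
        """Register a field along with the business reason for collecting it."""
        if not purpose.strip():
            raise ValueError(f"no purpose given for '{field_name}'")
        self._purposes[field_name] = purpose

    def filter(self, event: dict) -> dict:
        """Drop everything when disabled; otherwise drop undeclared fields."""
        if not self.enabled:
            return {}
        return {k: v for k, v in event.items() if k in self._purposes}


policy = TelemetryPolicy()
policy.declare("crash_code", "diagnose firmware crashes in the field")
```

With telemetry enabled, `policy.filter({"crash_code": 7, "location": "home"})` keeps only `crash_code`. Nobody declared why `location` was needed, so it never leaves the device.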

One-time purchase
Time payment / lease
Subscription
Auxiliary fee (i.e. percentage of transaction)
Passive fee (i.e. Ads)
Subsidized (government, insurance, work, club, union)

Future-proofing devices

[ Cartridge or components that are replaceable ]

[ Skins – smart hardware attachments that augment the personality (give voice, etc.) for example, a hat, or pair of eyes that gives it different attributes ]

[ Detachable - take a part away, then bring it back. Remove a cartridge and attach it to your phone, then put it back when done. ]

Next and Final section: StoryFAQ

Title Photo by Dan Cristian Pădureț on Unsplash

Ramin.Work
