A Toolkit for Building Home Networking Applications
Yi-Min Wang, Microsoft Research, Redmond, WA
Wilf Russell, Microsoft Research, Redmond, WA
Anish Arora, Ohio State Univ., Columbus, OH
Dependability, extensibility, user-friendly interface, and remote access capability are the four key issues.
Figure 1(a) illustrates an ideal Aladdin home networking system where the house is wired for running Ethernet and most smart devices are UPnP network devices that directly plug into the Ethernet and run the device control software on themselves. Network video cameras, door/window/motion sensors, and thermostats are singled out because security systems and temperature control systems are the two most essential and popular home networking applications. A home gateway machine sits between the home network and the external communication infrastructures including the Internet and telephony. User Access Points (UAPs) are wall-mounted or stand-alone flat-panel displays deployed throughout the house to allow convenient access to information anywhere in the house as well as on the Internet. UAPs also expose Web-based, natural language-based, and voice-based interfaces for controlling household devices remotely and for monitoring environmental factors through remote sensors. Network bridges are provided for bridging devices on other communication media to the Ethernet backbone. Such devices do not directly connect to the main network for various reasons including legacy, cost, security, market competition, etc.
In order to investigate systems issues and explore potential killer applications before the emergence of UPnP smart devices, the current Aladdin system uses six Windows 98 PCs (three desktops and three laptops) and their peripherals to serve as both User Access Points and network bridges, as shown in Figure 1(b). The PCs also act as device proxies by running device control software on behalf of the current generation of devices. The system is deployed in the first author’s three-story house and is being used by the author’s family on a daily basis. The desktops have internal 10Mbps phoneline networking adapters, while the laptops use external 1Mbps adapters [HomePNA98]. Every PC is equipped with a motion sensor and a USB camera; one PC has a thermostat connected to its serial port. All control and sensor signals are routed over the phoneline Ethernet and, if necessary, bridged to other communication media through the PCs. For example, the powerline in the house has a serious signal attenuation problem, which prevents low-cost consumer powerline devices from communicating reliably between arbitrary pairs of power outlets. So, for each remote powerline control request, the Aladdin system first routes the request to the PC that is (electrically) closest to the target device and then bridges the command out to the powerline. Similarly, battery-operated RF devices such as motion sensors and door sensors are usually short-ranged to preserve battery power. RF signals transmitted by any of these sensors are received by the PCs within its range and bridged to the phoneline network to reach other parts of the house. Since IR (Infrared) signals only work within a single room, they are bridged by the PC located in the same room. In total, the current Aladdin system consists of n devices.
A UAP can be a dumb Web pad that simply renders the UI or, in a smart environment, it may run learning software, location-tracking software, vision-based person identification and gesture recognition, and voice recognition.
Figure 1. Distributed System Architecture of the Aladdin Home Networking System. (a) Ideal Future Architecture; (b) Current Architecture (only the peripherals of the Kitchen UAP are shown; all the other UAPs have similar setup).
Figure 2 shows the overall software architecture of the Aladdin system, which manages and provides an abstraction above the hardware devices, and connects them to the external communication infrastructures. We give a brief overview of the three layers in this section and will focus on the system infrastructure components in the remainder of the paper.
1. System infrastructure: At the bottom layer, the system infrastructure consists of five components. The Soft-State Store (SSS) manages the lifetime and the propagation of soft-state variables, where soft-state is defined as volatile or persistent state that would expire and become invalid if not refreshed in time. The Aladdin system makes heavy use of soft-state to achieve robustness, as will be discussed throughout the rest of the paper. Built on top of the Soft-State Store, the Pub/Sub eventing component allows programs to subscribe to events related to changes in the Soft-State Store. The Attribute-Based Lookup Service (ABLS) maintains a database of all available devices and supports queries based on device attributes such as device type, physical location, etc. The Name-Based Lookup Service (NBLS) maintains a table of all running object instances and supports simple name-to-object-address mapping. The Device Announcement Protocol describes how devices connected to other communication media announce their existence to the ABLS. We have built an Aladdin Device Adapter that implements an instance of that protocol for powerline devices. Finally, the system management daemons are responsible for detecting the failures of PCs and devices, and initiating recovery actions.
2. Application layer: There are three types of home networking applications in the current Aladdin system. In the control-type scenario, the applications receive user requests as input, consult the lookup services to identify the devices and device objects that should be involved, and perform actions on them to satisfy the request. Device objects encapsulate device- and network-specific details and present interfaces (sets of method calls) as the programming abstraction for device control. Examples are camera objects for taking snapshots and recording video clips, garage door opener objects for operating the garage doors, etc. Currently, the device objects are based on DCOM, a synchronous object RPC model. As the distributed computing world moves towards asynchronous messaging, Aladdin relies on the Name-Based Lookup Service to provide a smooth migration path by allowing object addresses for multiple programming models to coexist and relying on the client-side library to interpret different object address formats. Also, as the devices become more intelligent and powerful, the device control software is expected to move to the devices themselves.
In the sensing-type scenario, the applications receive a list of factors of interest to monitor and the corresponding actions. Through the ABLS, the applications identify the appropriate events to subscribe to. Independently, device daemons act as proxies for sensors by monitoring sensor signals and updating appropriate soft-state variables. For example, a powerline device daemon queries the ABLS for all powerline sensors and their corresponding powerline signals. It then listens on the powerline, looking for matching signals transmitted by the sensors when they fire. Upon detecting a matching signal, the daemon updates the corresponding soft-state variable, which in turn triggers events to the subscribers. A safety application is built this way: it sends an emergency email to ring the homeowner’s cell phone when it detects that any of the critical sensors (water sensors, power outage sensors, temperature sensors, etc.) has fired.
The third type of application combines both sensing and control. For example, an Aladdin security application plays dog-barking sounds and human voices and records video clips when it detects motion while no one is at home. A temperature daemon automatically changes the thermostat setting from “away” to “at-home” when the homeowner stays at home during a weekday and the motion sensors detect persistent motion. Some commercial products do exist for intelligent sensing and control, but they tend to be closed solutions limited to a single communication medium and are not extensible. Aladdin’s approach of using lookup services to tie together devices connected to any communication medium provides a unique opportunity for constructing innovative applications.
3. User interface: Unlike most other distributed systems, home networking systems are to be used by naïve computer users, so providing a friendly user interface is especially important. The current Aladdin system supports three forms of user interface: a browser interface that allows the user to browse through all available devices or select devices based on attributes, and to control devices through point-and-click; a text-based natural language interface based on a limited but customizable vocabulary; and a voice-based interface that employs speech recognition technology based on the same vocabulary. All three forms are available for in-home use. They are also being extended to support remote home automation when the user is away from home. When DSL or cable modem is available, the same browser interface can be used from remote locations. The text-based natural language interface has been extended to an email-based remote home automation interface. The user can send an email containing a control request to an MSN (Microsoft Network) account. The Aladdin email daemon periodically dials up MSN, retrieves and parses the request, performs the actions, and sends a reply email that may optionally contain a snapshot or video clip confirming the actions. In addition to the standard security mechanisms such as digital signatures and data encryption, the home control vocabulary can be customized to provide additional security. Through the text messaging support provided by cell phones, the email daemon can almost synchronously alert the users wherever they are when any of the critical sensors fires at home. Work on extending the voice-based interface to work reliably over telephony is still in progress.
Figure 2. Software Architecture of the Aladdin Home Networking System.
According to an informal survey, the following application scenarios currently supported by Aladdin are considered most useful:
1. The user rushed out for a meeting and forgot to close the garage door. Sitting in the conference room, he can send a secure email to remotely close the garage door. To ensure reliable operation, the garage door is equipped with three inexpensive, redundant sensors: a magnetic sensor that detects that the door is at least two inches open, and two horizontal/vertical sensors that detect that the door is at least 25% and 75% open, respectively. To add confidence in this critical operation, the system attaches to the reply email two snapshots of the garage door, one before the operation and one after.
2. Several water sensors are installed in those spots of the house that are more likely to have water leakage or flooding problems. When any of these sensors detects water, it notifies the system, which then fires an event to urge the email daemon to ring the homeowner’s cell phone and describe the problem using the text messaging capability. Other useful application scenarios that can utilize the same system service include temperature sensors for sensing unusually high temperature in the kitchen, and motion sensors, door sensors, and safe-box sensors that detect unusual activities when no one is at home or when semi-trusted people (baby-sitters, repairmen, etc.) are inside the house.
3. When the user is out of town, he/she can send an email to request a snapshot or a video clip of all indoor and outdoor cameras to make sure the house is OK.
4. A single voice command or button click closes all the curtains in the house.
5. Common temperature control systems only allow one thermostat setting for weekdays and another one for weekends. On a weekday holiday, motion sensors detect persistent motion inside the house and notify the system to automatically override the “away” thermostat setting and maintain a comfortable temperature for the user.
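As an illustration of how the three redundant garage-door sensors in scenario 1 could be combined, the following minimal Python sketch maps the three readings to a door state. The function names and state labels are hypothetical, and the consistency rule (a door that is at least 75% open must also be at least 25% and at least two inches open) is an assumption made for this sketch, not a description of the actual Aladdin device object.

```python
from typing import Tuple

# Readings are ordered as (at_least_two_inches, at_least_25_percent,
# at_least_75_percent). Only these four combinations are physically
# consistent, because each threshold implies the ones below it.
_CONSISTENT_STATES = {
    (False, False, False): "closed",
    (True, False, False): "slightly_open",   # between two inches and 25%
    (True, True, False): "partly_open",      # between 25% and 75%
    (True, True, True): "mostly_open",       # at least 75%
}


def classify_door(readings: Tuple[bool, bool, bool]) -> str:
    """Return a door state, or 'inconsistent' if the sensors disagree."""
    return _CONSISTENT_STATES.get(tuple(readings), "inconsistent")


def close_confirmed(readings_after: Tuple[bool, bool, bool]) -> bool:
    """A close request is confirmed only if all three sensors agree on 'closed'."""
    return classify_door(readings_after) == "closed"
```

An inconsistent combination, such as the 75% sensor reporting open while the two-inch sensor reports closed, would suggest a faulty or battery-depleted sensor; in that case a confirmation could fall back on the before/after snapshots.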
Consider the common requirements for the following scenarios: when a battery-operated garage-door sensor runs out of battery, the system should be able to detect that and alert the user; also, it should discard the previous state of the sensor so that home networking applications do not perform erroneous actions based on stale data. When a device suddenly gets disconnected from the system and when an object gets terminated abruptly, their corresponding entries in the lookup services should eventually expire to allow the system to reclaim the space of those entries and to minimize the chance of client applications getting stale information and potentially malfunctioning. When an essential daemon process fails either due to machine crash or process termination, the system should be able to detect that and either reset the machine or restart the daemon.
To address the above issues, either the system needs to ping the sensor/device/object/daemon or the latter needs to send periodic heartbeats to the system. The second solution is the preferred one in the home networking domain for three reasons. First, to reduce cost, many consumer sensors are transmitters only and do not support polling of their status. Second, many network protocols and even programming paradigms (distributed objects, messaging, etc.) are likely to coexist in home networking systems due to market competition. It would not be practical to require the system to be able to ping all networked entities with various protocols and paradigms. Third, by expanding the heartbeats to contain state information as well, the recovery tasks upon system crashes are greatly simplified. After a crash, the system simply restarts and relies on the heartbeats to reconstruct the state information that it was maintaining for the network entities and that was lost in the crash.
To simplify the development of home networking applications based on soft-state, we have implemented a Soft-State Store (SSS) that serves as a shared infrastructure for managing and propagating soft-states across different applications and machines. An event system is built on top of the SSS to allow applications to subscribe to events related to data changes in the SSS. An SSS daemon is responsible for maintaining soft-state timers that are used to time out stale data in case of missing refreshes.
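The soft-state idea can be sketched in a few lines of Python. This is only an in-memory illustration, not the Aladdin implementation (which is a COM server): the class and method names are hypothetical, and the expiration rule used here (a variable expires once `refresh_interval * max_missed` seconds pass without a refresh) is an assumption about how the refresh interval and the number of missing refreshes might be combined.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class SoftStateVar:
    value: Any
    refresh_interval: float  # expected seconds between refreshes
    max_missed: int          # missed refreshes tolerated before expiry
    last_refresh: float      # clock reading at the most recent refresh

    def deadline(self) -> float:
        # Assumed rule: expire after max_missed intervals without a refresh.
        return self.last_refresh + self.refresh_interval * self.max_missed


class SoftStateStore:
    """Hypothetical in-memory soft-state store with change notifications."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._vars: Dict[str, SoftStateVar] = {}
        self._subscribers: List[Callable[[str, str, Any], None]] = []

    def subscribe(self, callback: Callable[[str, str, Any], None]) -> None:
        self._subscribers.append(callback)

    def set_value(self, name: str, value: Any,
                  refresh_interval: float = 10.0, max_missed: int = 3) -> None:
        """Create the variable, or refresh it (timing parameters are kept)."""
        now = self._clock()
        var = self._vars.get(name)
        if var is None:
            self._vars[name] = SoftStateVar(value, refresh_interval, max_missed, now)
            self._fire("added", name, value)
            return
        changed = var.value != value
        var.value = value
        var.last_refresh = now  # every write doubles as a heartbeat
        if changed:
            self._fire("changed", name, value)

    def get_value(self, name: str) -> Any:
        """Return the value, or None if it is unknown or stale."""
        var = self._vars.get(name)
        if var is None or self._clock() >= var.deadline():
            return None
        return var.value

    def expire_stale(self) -> List[str]:
        """Drop stale variables; a daemon would call this periodically."""
        now = self._clock()
        stale = [n for n, v in self._vars.items() if now >= v.deadline()]
        for name in stale:
            del self._vars[name]
            self._fire("expired", name, None)
        return stale

    def _fire(self, kind: str, name: str, value: Any) -> None:
        for callback in self._subscribers:
            callback(kind, name, value)
```

Because a stale variable is never returned by `get_value()`, an application reading a sensor whose battery has died sees "unknown" rather than the last reported state, which is the behavior motivated in the scenarios above.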
Persistence story
Why NBLS high-freq and ABLS low-freq
battery-operated sensors don't fail that often
Why SSS:
SSS in Aladdin: for failure detection
for timing out and reclaiming space of stale lookup service entries
for stopping higher layer when lower layer is stabilizing (do we have examples)
for storing non-pollable sensor states and timing them out when stale to avoid giving out-of-date bad info
Implementation
The Soft-State Store is implemented as a COM EXE server that supports two interfaces. The ISoftStateStoreAdmin interface contains the following methods:
· RegisterSoftStateTypes() and RemoveSoftStateTypes() for defining and removing custom types of soft-state variables, respectively; for example, the CRITICAL_SENSOR type;
· ChangeTimeoutOnVar() and ChangeTimeoutOnSubscription() for manipulating the metadata of soft-state variables and event subscriptions. Specifically, they allow changes to be made to the refresh intervals and the number of missing refreshes before variable expiration.
The second interface, ISoftStateStore, consists of:
· RegisterSoftStateVars() and RemoveSoftStateVars() for adding and removing individual soft-state variables of particular types, respectively;
· SetValue() and GetValue() for setting and retrieving the values of variables, respectively;
· SubscribeEvents() and UnsubscribeEvents() for subscribing and unsubscribing events related to the changes of individual soft-state variables or all variables of certain types, respectively.
The callback addresses supplied as part of the input parameters to SubscribeEvents() are in the form of NBLS unique names, to be described shortly, to allow late binding. When the SSS needs to fire an event to a subscriber, it resolves the name at that time through the NBLS to obtain the subscriber’s up-to-date addresses. This late-binding model is essential for coping with the failure, recovery, and mobility of the subscribers, and is particularly important for subscriptions to rare events. If a subscriber uses the COM programming paradigm, it must support the ISSSNotify interface and announce the (marshaled) interface pointer to NBLS. ISSSNotify consists of four methods:
· Added() is invoked when a new soft-state variable is added to a subscribed type;
· Changed() and Deleted() are invoked when a variable or type is updated and deleted, respectively. A flag in the Deleted() call indicates if the deletion was the result of an explicit removal operation or expiration due to missing refreshes.
· MetaUpdate() notifies the subscriber that metadata associated with the variables or types has been changed.
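The late-binding model described above can be illustrated with a small Python sketch. The `NameTable` class is a hypothetical stand-in for the NBLS, and plain Python callables stand in for marshaled ISSSNotify interface pointers; none of these names come from the Aladdin code. The point shown is that the publisher stores only the subscriber's unique name and resolves it each time an event fires.

```python
from typing import Callable, Dict, List, Optional

Callback = Callable[[str, object], None]


class NameTable:
    """Hypothetical stand-in for the NBLS: unique name -> current callback."""

    def __init__(self) -> None:
        self._entries: Dict[str, Callback] = {}

    def announce(self, name: str, callback: Callback) -> None:
        self._entries[name] = callback

    def revoke(self, name: str) -> None:
        self._entries.pop(name, None)

    def lookup(self, name: str) -> Optional[Callback]:
        return self._entries.get(name)


class LateBoundPublisher:
    """Stores subscriber names only; addresses are resolved at fire time."""

    def __init__(self, names: NameTable) -> None:
        self._names = names
        self._subscriptions: Dict[str, List[str]] = {}  # variable -> names

    def subscribe(self, variable: str, subscriber_name: str) -> None:
        self._subscriptions.setdefault(variable, []).append(subscriber_name)

    def fire_changed(self, variable: str, value: object) -> List[str]:
        """Notify subscribers that are currently reachable; return their names."""
        delivered = []
        for name in self._subscriptions.get(variable, []):
            callback = self._names.lookup(name)  # resolved now, not at subscribe time
            if callback is None:
                continue  # subscriber is down or has not re-announced yet
            callback(variable, value)
            delivered.append(name)
        return delivered
```

If a subscriber crashes and restarts on a different machine, it only needs to re-announce itself under the same unique name; the subscription itself does not have to be repeated.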
Usage?
Why soft-state lookup service:
· can handle abrupt leave (another way is to ask the LS to maintain a list of objects/devices to poll: this puts state in the LS and needs data recovery. With soft state, if the device dies, the entry should be gone, so no recovery is necessary.)
· For low-freq updates, we need persistence anyway and so we can save the list without too much additional trouble. But some devices may not be pollable/pingable.
· ABLS needs to know the (CLSID) type anyway (but it may not know its details (methods, etc.) only clients know)!!, can NBLS just ask ABLS to find out the type and call a standard ping method(!) (Soft-state doesn’t need standard ping method because the timing out mechanism is general.)
3.2.1. Attribute-Based Lookup Service (ABLS)
Physical-location based naming.
Aladdin Device Adapter (ADA)
How ADA and sensor periodic refreshes work
(1) Sensor model:
* Motion sensor: On, still on, not on, off
* Garage door tilt sensor: periodic On/Off
* Smoke alarm: infinite heartbeat interval to short heartbeat interval
Implementation
ABLSLocaleUpdate();
ABLSDeviceRegister();
ABLSModuleRegister();
ABLSLookUp();
ABLSUpdate();
ABLSEnable();
ABLSDisable();
ABLSDelete();
3.2.2. Name-Based Lookup Service (NBLS)
Implementation
NBLSAnnounce();
NBLSRevokeService();
NBLSLookup();
FreeListOfAddresses();
CreateDCOMRefDisplayName();
CreateOLFolderDisplayName();
BindToDisplayName();
FreeDisplayName();
Jini lookup services and leasing
Berkeley soft-state work on Service Discovery Services, BASE, SS-TCP, multimedia archival
Other home networking projects
Why is Aladdin better than point solution:
(1) Eventually cheaper;
(2) Can integrate devices, possibly across multiple communication media
(3) Uniform interface for all devices (natural language, browser-based)
DCOM ping is soft-state
We have described our dependability solutions to the problems we encountered in the deployment of the Aladdin home networking system. The Aladdin Device Adapter was built to allow adding the heartbeat capability to existing dumb devices, which will remain an important part of future home networking systems. The multi-timescale, soft-state-based mechanism was devised to maintain the consistency of lookup services in a heterogeneous and dynamic environment. The idea of powerline activity monitoring, diagnosis, and recovery was proposed to deal with signal interferences, faulty transmitters, etc. Node diagnosis protocols were designed to take advantage of the inherently redundant links and powerline control capability in order to automatically maintain high availability in the absence of system administrators.
In a dynamic environment, it is not unlikely that multiple dependability problems will occur around the same time and trigger concurrent recovery actions. It is therefore important to ensure that these concurrent recovery actions do not interfere with each other, potentially bringing the system into an unrecoverable state. Our current approach is to build a soft-state-based dependability framework that supports both polling and eventing. The framework will be used as a shared infrastructure to simplify the implementations of dependability solutions and to facilitate the proofs of interference freedom among recovery actions for guaranteeing end-to-end system stabilization [Arora95].
Acknowledgement:
Dan Fay
[Arora95] A. Arora and D. Poduska, “A Timing-based Schema for Stabilizing Information Exchange in Networks,” in Proc. Int. Conf. on Computer Networks, 1995.
[Chung98] P. E. Chung et al., “DCOM and CORBA Side by Side, Step by Step, and Layer by Layer,” in C++ Report, Vol. 10, No. 1, pp. 18-29,40, Jan. 1998.
[CM11ASpec] CM11A Programming Specification, ftp://ftp.x10-beta.com/ftp/protocol.txt.
[Evans96] G. Evans, “The CEBus Standard User's Guide”, http://www.cebus.com/training.htm#book, May 1996.
[HF97] G. Hughes-Fenchel, “A Flexible Clustered Approach to High Availability”, in Proc. FTCS, pp.314-318, 1997.
[HomePNA98] The Home Phoneline Networking Alliance, “Simple, High-Speed Ethernet Technology for the Home,” http://www.homepna.org/docs/wp1.pdf, June 1998.
[Huang93] Y. Huang and C. Kintala, “Software Implemented Fault Tolerance: Technologies and Experience,” in Proc. FTCS, pp. 2-9, 1993.
[Jini99] W. K. Edwards, “Core Jini”, Prentice-Hall Inc., 1999.
[Kumar94] S. Kumar and E. H. Spafford, “A Pattern Matching Model for Misuse Intrusion Detection,” In Proc. National Computer Security Conference, pp. 11-21, Oct. 1994.
[Vogels98] W. Vogels et al., “The Design and Architecture of the Microsoft Cluster Service,” in Proc. FTCS, pp. 422-431, 1998.
MIT Oxygen
1. There is a Scientific American article on the subject (Aug. 1999).
2. The ubiquitous computing effort has components of (a) handheld devices, (b) home networking, and (c) networking protocols.
3. John Guttag, Anant Agarwal, and Victor Zue are involved.
4. http://www.sls.lcs.mit.edu/sls speech component
5. http://sds.lcs.mit.edu/SpectrumWare/home.html software wireless components
6. http://cag.lcs.mit.edu/raw architecture component
http://www.sciam.com/1999/0899issue/0899agarwal.html
GaTech - AwareHome
http://www.cc.gatech.edu/fce/house/
GaTech - SmartFloor
http://www.cc.gatech.edu/fce/smartfloor/
Mike Mozer - Colorado - Adaptive House
http://www.cs.colorado.edu/~mozer/house/
As the success of the Web increasingly brings us towards a fully connected world, home networking systems that connect and manage home appliances become the natural next step to complete the connectivity. Although there has been fast-growing interest in the design of smart appliances, there has been little study of the dependability issues, which are essential to making home networking part of our daily lives. In this paper, we report our experience in the design, implementation, and deployment of the Aladdin home networking system. The heterogeneity of various in-home networks, the undependable nature of consumer devices, and the lack of knowledgeable system administrators in the home environment introduce both opportunities and challenges for dependability research. We describe the overall architecture of the current Aladdin system for supporting remote home automation, the dependability issues that we have observed in the actual deployment, and the distributed system solutions that we have implemented to address those issues.
With the explosive growth of the Web, we are increasingly moving towards a fully connected world. Started as merely an information access mechanism, the Web is now fundamentally changing the way we live by providing multimedia communications and electronic commerce. The success of the Web has demonstrated that the power of being connected can drive innovations and create new applications beyond one’s imagination.
As broadband communication is being brought to the homes with accelerating speed and as small handheld devices get smarter, more popular, and better connected, the notion of being able to communicate with anything at any time from anywhere is bound to become a reality. In this big picture, home networking is a natural next step in which both existing devices and future smart appliances are fully connected inside the house and accessible to the homeowners whenever needed. Just like the evolution of the Web, the most basic home networking applications have emerged first and the power of being connected is now paving the way for unlimited future innovations. Starting from the simple scenarios of sharing files, printers, and Internet connections, home networking is also moving towards enabling multi-player, multi-PC games, digital video and audio anywhere in the house, device automation, remote diagnosis of home appliances, etc. An informal survey shows that different people have dramatically different ideas on what the killer applications for home networking should be, depending on their life styles. It is therefore important to provide an infrastructure for device connectivity to allow the construction of versatile applications on top of the infrastructure.
In the Aladdin project, we focus on providing the system infrastructure for device connectivity by integrating the seven in-home networks into one dependable home network: powerline, phoneline, RF (radio frequency), IR (Infrared), A/V LAN, security, and temperature control. The goal is to allow the users to plug in a device on any of these networks and make it part of the Aladdin system so that it can be used in conjunction with all the other devices to accomplish higher-level system- or user-directed tasks. To make the whole system good enough to live with, one must pay special attention to the dependability issues, including reliability, availability, security, and manageability. The second goal of the Aladdin project is to support dependable remote home automation and sensing. We believe that home networking adds significant value even when people are away from their homes. Therefore, providing reliable and secure remote access to home networks and providing reliable sensing and control of devices are important parts of the project.
From a dependability point of view, home networking introduces several new challenges. First, in the consumer electronics market, achieving volume in order to drive the price down is a key to success. Manufacturers are therefore led to package their products as add-on modules with primitive I/O specifications so that they can be used with a variety of different systems and can be added incrementally to existing systems to control new or existing devices. However, such a design creates dependability problems that must be dealt with by the system. The problems with the powerline-based modules and sensors, which will be described later, are good examples. Second, home networks are heterogeneous and dynamic. Each of the in-home networks has different characteristics in terms of bandwidth, connectivity, security, interference, etc. This provides a new opportunity to exploit the redundancy provided by one network to solve dependability problems faced by another network. In addition, compared to the machines in the enterprise environment, consumer devices in the home networking environment are more dynamic in terms of mobility, availability, and extensibility. The system must be able to keep track of all the changes in the entire network in order to support reliable operation. Finally, enterprise environments usually rely on human administrators to perform tasks such as failure diagnosis and recovery, and intrusion detection and defense. In the home networking domain, however, we cannot afford the same level of administrator support, so the system must perform those tasks automatically as much as possible. What makes it even more challenging is the fact that consumer devices fail more often and in more varied modes, and intrusions and interference can come from different networks.
Building a dependability framework that monitors, diagnoses, and recovers from known dependability problems and allows for extensibility to accommodate new failure modes as they are observed is the centerpiece of the Aladdin project.
UPnP (Universal Plug and Play)
Naming with physical location
Through the ABLS administration console application, the user performs a one-time manual task of assigning a unique address to each power outlet, wall switch, etc. that the user would like to control remotely. The current system uses the X10 powerline-control addresses as the unique addresses [CM11ASpec]. Each of the X10 addresses consists of a house code (A through P) and a unit code (1 through 16). For example, the X10 address “K3” may be assigned to an outlet on the family-room side of the kitchen on the first floor. In the ABLS, the outlet then has a database entry with “X10Addr=K3”, “floor=1”, “room=kitchen”, and “side= family_room”. For fixed devices such as wall switches, the additional “device” attribute can be manually entered to associate it with the physical-location attributes. For moveable devices such as a floor lamp, the “device=lamp” attribute is announced when the lamp is plugged in and switched on, and the registration with the ABLS is performed dynamically by proxy controllers (usually PCs) that receive such announcements. (See the description of the Aladdin Device Adapter for more details.) Smart devices such as PCs register themselves as well as their peripherals at installation time. In contrast, device control object instances are registered with the NBLS only when they are instantiated. Computation objects such as language parsers and speech recognizers are registered in a similar fashion. The interface IDs [Chung98] they support are registered with the ABLS as the attributes of the software modules during the installation process, while the pointers to object instances are registered with the NBLS when objects are instantiated and available to receive requests.
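The address format and the resulting ABLS entry described above can be sketched as follows. The helper names are hypothetical and the dictionary layout is an assumption made for illustration; only the X10 house-code range (A through P), the unit-code range (1 through 16), and the attribute names come from the text.

```python
import re
from typing import Dict, Optional, Tuple

# House code A-P followed by unit code 1-16, e.g. "K3" or "A16".
_X10_ADDRESS = re.compile(r"([A-P])(1[0-6]|[1-9])")


def parse_x10_address(text: str) -> Tuple[str, int]:
    """Split an X10 address into (house_code, unit_code) or raise ValueError."""
    match = _X10_ADDRESS.fullmatch(text.strip().upper())
    if match is None:
        raise ValueError(f"not a valid X10 address: {text!r}")
    return match.group(1), int(match.group(2))


def make_outlet_entry(x10_addr: str, floor: int, room: str,
                      side: Optional[str] = None) -> Dict[str, str]:
    """Build a hypothetical ABLS entry of attribute-value pairs for an outlet."""
    house, unit = parse_x10_address(x10_addr)
    entry = {"X10Addr": f"{house}{unit}", "floor": str(floor), "room": room}
    if side is not None:
        entry["side"] = side
    return entry
```

For the kitchen outlet in the example, `make_outlet_entry("K3", 1, "kitchen", "family_room")` yields the four attribute-value pairs listed in the text. With 16 house codes and 16 unit codes, this scheme provides 256 distinct addresses.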
Figure 2 also illustrates the interactions among system components for email-based home automation. At the front end, the email-reading daemon periodically (or when triggered by a phone call) dials up the Internet Service Provider (ISP) to retrieve digitally signed and encrypted emails containing home automation requests. After validating the signature on an email, the daemon passes the request to a natural language parser to convert it into an action and a list of attribute-value pairs. For example, the request “Turn off all second-floor lamps” gets parsed into “action=off” and “device=lamp AND floor=2”. These attribute-value pairs are then submitted to the ABLS to find all matching devices, each of which is identified by a unique name, for example, A2_lamp. To control each matching device, the daemon queries the NBLS with the device’s unique name and gets back a list of addresses that can be used to reach appropriate device control objects, which are either already running or activated on-demand. Upon receiving a call from the daemon, an object sends control signals to the target device and receives feedback signals from the sensor(s), if available, to confirm the action. For example, a user can send an email containing the request “Close the garage door.” The email-reading daemon locates the object in charge of the garage door opener and invokes its Close() method. The garage door has three sensors mounted to provide the information that the door is “at least two inches open”, “at least 25% open”, and “at least 75% open”, respectively. Inside the Close() method, the object monitors the sensor outputs to ensure that the door is indeed closing and closed at the end. To provide an even higher-confidence confirmation, the daemon also locates a camera in the garage to take a snapshot of the garage door before and after the closing action, and sends the two snapshots as attachments in the email reply back to the user.
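The two-step lookup in this flow (attributes to unique names through the ABLS, then unique names to object addresses through the NBLS) can be sketched in Python as follows. The data layout, function names, and address strings are all hypothetical; the sketch starts from the parser's output rather than attempting natural language parsing, and it simply picks the first address returned for each device.

```python
from typing import Dict, List, Tuple

Device = Dict[str, str]


def abls_lookup(devices: List[Device], query: Dict[str, str]) -> List[str]:
    """Return unique names of devices matching every attribute in the query."""
    return [d["name"] for d in devices
            if all(d.get(key) == value for key, value in query.items())]


def plan_actions(action: str, query: Dict[str, str], devices: List[Device],
                 nbls: Dict[str, List[str]]) -> List[Tuple[str, str, str]]:
    """Resolve a parsed request into (device_name, object_address, action)."""
    plan = []
    for name in abls_lookup(devices, query):
        addresses = nbls.get(name, [])
        if not addresses:
            continue  # no control object is registered for this device
        plan.append((name, addresses[0], action))
    return plan


# Hypothetical contents of the two lookup services.
DEVICES = [
    {"name": "A2_lamp", "device": "lamp", "floor": "2", "X10Addr": "A2"},
    {"name": "A5_lamp", "device": "lamp", "floor": "2", "X10Addr": "A5"},
    {"name": "K3_lamp", "device": "lamp", "floor": "1", "X10Addr": "K3"},
    {"name": "B5_fan", "device": "fan", "floor": "2", "X10Addr": "B5"},
]
NBLS = {
    "A2_lamp": ["object://upstairs-pc/lamp-control"],
    "K3_lamp": ["object://kitchen-pc/lamp-control"],
}

# "Turn off all second-floor lamps" after parsing:
PLAN = plan_actions("off", {"device": "lamp", "floor": "2"}, DEVICES, NBLS)
```

In this sketch a device with no NBLS entry is skipped; in the flow described above, the control object could instead be activated on demand.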
In addition to the above on-demand control scenarios, event notification is another very useful system functionality. When any sensor detects an anomaly, it notifies the system, which then either generates an audio or video alarm or sends an email to notify the homeowner. The main server PC, which is connected to a low-cost UPS (Uninterruptible Power Supply), is responsible for detecting power outages and sending an emergency notification email.
Note that the various system components in Figure 2 can all be running on different PCs, dynamically chosen based on load distribution and machine availability. It is the responsibility of the system management layer to ensure that all the components are available to carry out the requested actions in spite of machine or network failures.
The Attribute-Based Lookup Service (ABLS) and the Name-Based Lookup Service (NBLS) are two of the most critical components in the Aladdin system. The dependability issues associated with these two lookup services cover two aspects: the reliability and availability of the lookup services themselves, and the reliability and availability for device controls provided through the lookup services. In the Aladdin system, we accommodate not only smart devices that can themselves host device control objects, but also existing dumb devices that must rely on smart proxy devices to run control objects for them. Therefore, we separate the lookup service into ABLS for devices and NBLS for running object instances, and use different storage media for them based on their usage.
In the dynamic home networking environment, it is not uncommon for devices and objects to leave or fail abruptly without removing their lookup service entries. To maintain reasonable consistency and to reclaim storage space, both ABLS and NBLS adopt a heartbeat-based soft-state mechanism (analogous to the notion of leasing in other systems [Jini99]), where the term “soft-states” is defined as volatile or nonvolatile states that expire if not refreshed within a pre-determined but configurable amount of time. In the NBLS case, running object instances are responsible for sending heartbeats over the phoneline to refresh their own entries. In the ABLS case, an Aladdin Device Adapter is responsible for sending heartbeats over the powerline on behalf of the attached device. As we will demonstrate shortly, the heterogeneous and dynamic nature of home networking introduces the interesting notion of multi-timescale soft-states.
Aladdin is built upon a distributed object system [Chung98], so the NBLS entries typically hold object interface pointers (in their string, marshaled form). It has been observed, however, that the synchronous, RPC-style communication that distributed objects provide works well only when devices and objects are always available, and may not be sufficient for our dynamic environment. For example, home appliances may be temporarily switched off, PCs may reboot, and networks may suffer from transient interference. We propose the concept of a Device Address Book (DAB) to accommodate the multiple programming paradigms necessary for a dynamic environment. In the DAB model, each device can be reached by multiple objects and each object can be reached through multiple “addresses”, for example, one for synchronous calls when the object is active and another for queued, asynchronous calls when the object is temporarily unavailable. This is analogous to the human communication model in which each person can have a phone number, an email address, a Web page URL, etc. in an address book entry, hence the name device address book.
Any object can register multiple addresses in its NBLS entry, identified by a unique NBLS name, and choose a different heartbeat interval for each address. The intervals for synchronous addresses are typically on the order of seconds to tens of seconds to allow fast detection of failed objects. The intervals for queued addresses are usually longer (minutes to tens of minutes) to allow for object scheduling and for process or system recovery. For example, when its hosting node fails and reboots, an object is destroyed and its synchronous address in NBLS will soon expire. If a client wants to contact the object at that time and a queued call is acceptable, it can do so through the remaining queued address. When the object recovers, it checks its message queue and processes the request, much as a person walks into the office in the morning and checks voice mail or email.
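The multi-timescale behavior described here can be sketched as a small soft-state table in which every address carries its own time-to-live. This is only an illustration under stated assumptions: the class name `SoftStateTable`, the address strings, and the concrete intervals (10 seconds and 600 seconds) are hypothetical, and the clock is passed in explicitly so that the example is deterministic.

```python
class SoftStateTable:
    """Entries expire unless refreshed within their own time-to-live (seconds)."""

    def __init__(self):
        # (name, kind) -> (address, ttl, time of last heartbeat)
        self._entries = {}

    def refresh(self, name, kind, address, ttl, now):
        """Register an address or record a heartbeat for it."""
        self._entries[(name, kind)] = (address, ttl, now)

    def lookup(self, name, now):
        """Return {kind: address} for the addresses of name that are still fresh."""
        self._expire(now)
        return {k: addr for (n, k), (addr, _, _) in self._entries.items()
                if n == name}

    def _expire(self, now):
        dead = [key for key, (_, ttl, last) in self._entries.items()
                if now - last > ttl]
        for key in dead:
            del self._entries[key]


# One object registers a synchronous and a queued address at time 0,
# then its hosting node fails and no further heartbeats arrive.
nbls = SoftStateTable()
nbls.refresh("A2_lamp_ctl", "sync", "sync://pc1/A2_lamp_ctl", ttl=10, now=0)
nbls.refresh("A2_lamp_ctl", "queued", "queue://pc1/A2_lamp_ctl", ttl=600, now=0)

fresh = nbls.lookup("A2_lamp_ctl", now=5)      # both addresses still present
degraded = nbls.lookup("A2_lamp_ctl", now=60)  # only the queued address remains
```

A client that finds only the queued address in `degraded` can still submit its request, which the object would process after recovery.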
To support efficient updates, the NBLS is stored in volatile memory. Applications perform NBLS updates and queries by sending their requests to a well-known multicast address. Normally, both a primary NBLS daemon and a backup NBLS daemon running on a different PC listen at that multicast address and process all update requests, but only the primary responds to queries. When the node diagnosis protocol, to be described later, detects that the node hosting the primary NBLS daemon has failed, it notifies the backup daemon, which then promotes itself to primary and starts responding to queries.
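The primary/backup arrangement can be sketched as follows. The sketch models the multicast address as a plain function that delivers each request to every live daemon; `NblsDaemon`, `multicast`, and the address string are hypothetical names, and real network delivery and the node diagnosis protocol are left out.

```python
class NblsDaemon:
    def __init__(self, is_primary):
        self.is_primary = is_primary
        self.alive = True
        self.table = {}          # volatile, in-memory name -> address map

    def on_update(self, name, address):
        # Primary and backup both apply every update they hear.
        self.table[name] = address

    def on_query(self, name):
        # Only the primary answers; the backup stays silent (returns None).
        if not self.is_primary:
            return None
        return self.table.get(name)

    def on_primary_failed(self):
        # Called when node diagnosis reports that the primary's node has failed.
        self.is_primary = True


def multicast(daemons, method, *args):
    """Deliver one request to every live daemon and collect the replies."""
    replies = []
    for daemon in daemons:
        if daemon.alive:
            reply = getattr(daemon, method)(*args)
            if reply is not None:
                replies.append(reply)
    return replies


primary, backup = NblsDaemon(is_primary=True), NblsDaemon(is_primary=False)
daemons = [primary, backup]
multicast(daemons, "on_update", "A2_lamp_ctl", "sync://pc1/A2_lamp_ctl")
before = multicast(daemons, "on_query", "A2_lamp_ctl")   # one reply, from primary

primary.alive = False                                     # primary's node fails
during = multicast(daemons, "on_query", "A2_lamp_ctl")   # nobody answers yet
backup.on_primary_failed()                                # backup promotes itself
after = multicast(daemons, "on_query", "A2_lamp_ctl")    # backup now answers
```

Because the backup has applied every update all along, it can answer queries immediately after promotion without any state transfer.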
The ABLS entries for devices are associated with much longer heartbeat intervals (tens of minutes to hours) because the powerline has a much lower bandwidth and devices do not join and leave very often. When an Aladdin Device Adapter is unplugged abruptly along with its attached device and is unable to send the leave announcement for the device, its missing heartbeats will eventually cause the ABLS entry to expire, reflecting the fact that the device is no longer available. Similarly, consumer sensors provide periodic updates of their states every tens of minutes to hours. Such updates are stored in ABLS both to provide the most recent sensor states (because most consumer sensors do not support polling) and to serve as heartbeats. When a sensor is broken or when its batteries run out, its ABLS entry will eventually expire to indicate that the sensor has left the system. This also prevents future clients from unknowingly accessing stale sensor data.
Because the ABLS entries are refreshed at a much lower rate and because the ABLS needs to support more complex queries, we store the ABLS in a persistent database to guard against the loss or corruption of data held only in volatile memory. To provide data availability upon a node failure, every ABLS update is synchronously replicated to a backup database. When the node diagnosis protocol detects a failure of the primary ABLS node, it requests the backup ABLS daemon to start responding to queries. The ABLS daemon announces its existence to the NBLS, just like any other object or service in the system, but it typically provides only a synchronous address because the ABLS is an essential service that should always be available.
Composite devices:
Example 1: Device object for Xcam looks for wireless sender/camera, wireless receiver, and VCR recorder
Example 2: Device object for garage door opener looks for door sensors with 25%, etc. attributes
When getting understood moniker type, use the right handle to bind and BindTo auto-select unmarshaler
Motivation. A basic reliability problem in using the add-on powerline modules to allow remote control of existing devices is that the on/off status of an add-on module may not be consistent with the on/off status of the device that plugs into it. Specifically, when the power switch on the device itself is turned off, the add-on module still changes its status according to the remote commands and also responds to queries with its own status, which may be inconsistent with the device’s status. One possible solution to this problem is to treat it as an operator error. But this is not adequate for two reasons. First, our experience shows that the notion of disallowing local control in order to achieve reliable remote control is not acceptable to the users. Second, even when the power switch is left on, the inconsistency can still occur when, for example, the bulb is burnt out or the lamp is broken. Therefore, an alternative solution is needed to allow the add-on module to report the true status of the device, and to notify the system when the device is no longer remotely controllable. Note that this reliability problem is not limited to powerline control. It can exist in any communication media that allow the use of add-on modules to provide remote controllability to existing devices.
A secondary issue is the announcement of a device joining the system. As pointed out in Section 2, the joining of fixed devices can be registered manually through the administration console. However, for moveable devices, manually performing de-registration and re-registration every time the devices change locations is not acceptable. It is therefore desirable to have a mechanism for such devices to automatically notify the system of their new locations.
Protocol and implementation. We describe a device announcement protocol and its implementation for powerline devices. Smart devices run this protocol themselves, while existing dumb devices rely on add-on modules to perform it. We call these add-on modules Aladdin Device Adapters. For X10 devices, we have built such an Adapter that consists of (1) an AC current detector that monitors the real working status of the attached device by measuring the AC current flowing through the device; (2) a regular X10 module that responds to remote On/Off commands by gating the AC current supplied to the device; and (3) a state machine that, based on the status of the current detector and the module, decides when to perform device joining and leaving announcements. An initial investigation of the hardware and software requirements shows that our X10 Adapter can be fabricated at low cost.
Initially, when the device is plugged into the Adapter and the Adapter is plugged into a wall outlet, both the power switch on the device and the X10 module are in the off position. The X10 address of the Adapter is set to that assigned to the outlet. By using the manual override function provided by X10, a user can turn on both the device and the module by simply turning on the power switch on the device. Upon detecting the status change, the state machine sends out a device-joining announcement over the powerline in the form of an extended X10 code [CM11ASpec], containing the following information:
· The X10 address of the outlet that identifies both the powerline address for controlling the device and, through ABLS, the physical location of the device inside the house.
· The device code that is pre-assigned to represent a particular type of device. For example, we use device code “1” for lamps, “2” for fans, etc. The mapping between device codes and device attributes is stored at the PCs.
· The module code that specifies the valid commands that can be sent to the Adapter to control the device. For example, we use module code “1” for X10 lamp modules that respond to On/Off/Dim commands, and “2” for the appliance module that responds to only On/Off commands, etc.
Due to potential powerline attenuation and interference, the announcement may reach only a subset of the PCs. Upon receiving the announcement, a PC decodes the X10 address, device code, and module code from the extended X10 code. After verifying that it has access to the software for controlling this type of X10 module, the PC registers the device with the ABLS. Multiple PCs registering the same device can be sorted by the signal strength of the announcement, which is a measure of reliability for powerline control. Since we currently do not have signal-strength measuring equipment that can be attached to each PC for dynamic measurements, we performed an a priori static measurement of point-to-point signal strength from every PC to every outlet and stored the results with the ABLS. These measurements are used by the ABLS to specify the preferred controlling PCs for each joining device.
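The registration and ranking step can be sketched as below. The announcement is modeled as an already-decoded triple, since the bit layout of the extended X10 code is not given here. The device and module code tables follow the examples in the text; the PC names, the signal-strength values, and the function `preferred_pcs` are hypothetical.

```python
from typing import NamedTuple


class Announcement(NamedTuple):
    x10_address: str   # address of the outlet, e.g. "A2"
    device_code: int   # 1 = lamp, 2 = fan (examples from the text)
    module_code: int   # 1 = lamp module, 2 = appliance module


DEVICE_TYPES = {1: "lamp", 2: "fan"}
MODULE_COMMANDS = {1: {"On", "Off", "Dim"}, 2: {"On", "Off"}}

# Hypothetical static measurements of signal strength from each PC to an outlet.
SIGNAL = {("pc1", "A2"): 0.9, ("pc2", "A2"): 0.4, ("pc3", "A2"): 0.7}


def preferred_pcs(ann, receiving_pcs, supported_modules):
    """Rank the PCs that heard the announcement and can drive this module type.

    supported_modules maps a PC name to the set of module codes for which
    that PC has control software installed.
    """
    candidates = [pc for pc in receiving_pcs
                  if ann.module_code in supported_modules.get(pc, set())]
    return sorted(candidates,
                  key=lambda pc: SIGNAL.get((pc, ann.x10_address), 0.0),
                  reverse=True)


ann = Announcement(x10_address="A2", device_code=1, module_code=1)
heard_by = ["pc2", "pc3", "pc1"]
software = {"pc1": {1, 2}, "pc2": {1}, "pc3": {2}}   # pc3 lacks lamp-module software
ranking = preferred_pcs(ann, heard_by, software)
```

In this example `pc3` is excluded because it cannot control lamp modules, and the remaining PCs are ordered from strongest to weakest measured signal.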
When the power switch on the device is turned off or when the device is broken, the state machine detects that the X10 module is still on but the AC current detector does not detect any current flowing through the device. It concludes that the device is no longer controllable by the Aladdin system and so sends out a device-leaving announcement on behalf of the device, again in the form of an extended X10 code over the powerline. Receiving PCs of this announcement then notify the ABLS of the device’s unavailability to preserve the consistency of the lookup service.
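The joining and leaving decisions can be summarized as a two-state machine driven by the X10 module status and the AC current detector. The following is a minimal sketch under simplifying assumptions: inputs are sampled as clean booleans, transient readings (for example, the instant between a remote On command and current starting to flow) are ignored, and the class and method names are hypothetical.

```python
class AdapterStateMachine:
    """Decides when an Adapter announces that its device joins or leaves."""

    def __init__(self):
        self.joined = False   # True once a joining announcement has been sent

    def step(self, module_on, current_detected):
        """Process one sample; return "join", "leave", or None."""
        if not self.joined and module_on and current_detected:
            # Device switched on locally and drawing current: announce joining.
            self.joined = True
            return "join"
        if self.joined and module_on and not current_detected:
            # Module is on but no current flows: the device was switched off
            # locally or is broken, so it is no longer remotely controllable.
            self.joined = False
            return "leave"
        # A remote Off (module off, no current) is a normal controlled state.
        return None


adapter = AdapterStateMachine()
samples = [
    (False, False),  # plugged in, everything off
    (True, True),    # user turns on the device; manual override turns on module
    (False, False),  # remote Off command
    (True, True),    # remote On command
    (True, False),   # power switch on the device turned off, or bulb burnt out
    (True, False),   # still off: no repeated announcement
]
events = [adapter.step(m, c) for m, c in samples]
```

Note that a remote Off command produces no announcement, because the module being off fully explains the absence of current.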
Some two-way X10 modules already send out an extended X10 code when plugged into an outlet, so the Aladdin Device Adapter (ADA) should not add much to the cost.