A cross-media Internet of Things system and a method thereof are disclosed. The system comprises three types of objects: a cloud server, a demo-group, and a mobile device. The demo-group further comprises a plurality of objects, which are artworks, media, or subjects within the scope of an activity. Each object is connected to the cloud server through a network link, so that the information of each object is stored on the cloud server, and the cloud server performs a dynamic interactive demonstration by controlling each object's action, sound, light, animation, and mechanical, electronic, and electrical changes. Moreover, each object of the demo-group is configured with a sensing interface, and activity software is installed on the mobile device. When a visitor holding the mobile device approaches the sensing interface of one of the objects, the activity software connects to the cloud server to obtain the information of that object. If the visitor then controls the mobile device with a gesture, the activity software uploads the track of the gesture motion to the cloud server, and the cloud server recognizes the gesture motion and either exchanges information with the object corresponding to the mobile device or controls that object to perform the demonstration.
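The interaction flow described above (approach an object, obtain its information, upload a gesture track, have the cloud server drive the object) can be illustrated with a minimal in-memory sketch. It is not the disclosed implementation: the names `DemoObject`, `CloudServer`, `on_approach`, `on_gesture`, and `recognize_gesture`, the gesture-to-action mapping, and the net-displacement recognizer are all assumptions introduced for illustration, and network links and sensing hardware are replaced by plain method calls.

```python
from dataclasses import dataclass, field


@dataclass
class DemoObject:
    """One artwork, medium, or subject in the demo-group (hypothetical model)."""
    object_id: str
    info: str
    actions: list = field(default_factory=list)  # log of demonstrations performed

    def demonstrate(self, action):
        # Stand-in for driving sound, light, animation, or mechanical changes.
        self.actions.append(action)


def recognize_gesture(track):
    """Toy recognizer (assumption): classify (x, y) points by net displacement."""
    if len(track) < 2:
        return "none"
    dx = track[-1][0] - track[0][0]
    dy = track[-1][1] - track[0][1]
    if dx == 0 and dy == 0:
        return "none"
    if abs(dx) >= abs(dy):
        return "swipe_right" if dx > 0 else "swipe_left"
    return "swipe_up" if dy > 0 else "swipe_down"


class CloudServer:
    """Stores object information and maps recognized gestures to object actions."""

    def __init__(self):
        self.objects = {}  # object_id -> DemoObject
        self.paired = {}   # device_id -> object_id of the object last approached
        # Assumed mapping from gesture to demonstration; not taken from the source.
        self.gesture_actions = {
            "swipe_right": "play_animation",
            "swipe_left": "stop_animation",
            "swipe_up": "lights_on",
            "swipe_down": "lights_off",
        }

    def register_object(self, obj):
        self.objects[obj.object_id] = obj

    def on_approach(self, device_id, object_id):
        # The device came near the object's sensing interface:
        # remember the pairing and return the stored information.
        self.paired[device_id] = object_id
        return self.objects[object_id].info

    def on_gesture(self, device_id, track):
        # Recognize the uploaded track and control the paired object.
        object_id = self.paired.get(device_id)
        if object_id is None:
            return None  # the device has not approached any object yet
        action = self.gesture_actions.get(recognize_gesture(track))
        if action is None:
            return None  # unrecognized gesture: do nothing
        self.objects[object_id].demonstrate(action)
        return action
```

In this sketch a visitor's session would consist of one `on_approach` call followed by any number of `on_gesture` calls; keeping the pairing on the server side reflects the abstract's statement that the cloud server, rather than the mobile device, controls the object.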