A Vision based Hand Gesture Interface for Controlling VLC Media Player

Human Computer Interaction stands to gain many benefits from the introduction of natural, device-free forms of communication. Gestures are a natural kind of action that we use constantly in everyday interaction, so using them as a communication medium with computers creates a new paradigm of interaction with computers. This paper applies computer vision and gesture recognition techniques to develop a vision-based, low-cost system for controlling the VLC media player through gestures. The application consists of a central processing module that applies Principal Component Analysis (PCA) to the gesture images, finds the feature vectors of each gesture, and saves them to an XML file. Recognition of a gesture is performed by the K-Nearest Neighbour (KNN) algorithm. The theoretical analysis of the approach shows how to perform recognition against a static background. The training images are created by cropping the hand gesture from the static background after detecting the hand motion with the Pyramidal Lucas-Kanade Optical Flow algorithm. This hand gesture recognition technique not only replaces the use of the mouse to control the VLC player but also provides a gesture vocabulary that can be useful in controlling the application.

Keywords: VLC player, recognition, gesture, human computer interface.

1. INTRODUCTION

WIMP (windows, icons, menus, pointers) paradigms, together with the keyboard and the mouse, have been definitive in providing flexibility in the use of computers. They give users a transparent conceptual model of the task, the instructions to perform, and their potential outcomes, and they allow a user a sense of accomplishment and responsibility in their interaction with computer applications [1]. Under this paradigm, users express their intent to the computer using their hands to perform button clicks, mouse positioning and key presses.
This is, however, a rather unnatural and restrictive way of interacting with computer systems. In everyday life, computers are becoming more and more pervasive, so it is highly desirable that interaction with these systems does not differ fundamentally from the natural interaction that takes place between people. Perceptual User Interfaces (PUI) are founded on the idea of extending Human Computer Interaction (HCI) to use all modalities of human perception. Early PUI development used vision-based interfaces that perform on-line hand gesture recognition, one of the simplest approaches. High accuracy and speed are the major advantages of hand gestures. The most established tools, such as mice, joysticks and keyboards, are capable devices for HCI, as they have been thoroughly tested; humans easily learn how to operate them and accomplish the most diverse and complex tasks with them. Interfaces based on computer vision techniques are also modest and economical, which makes them attractive. Traditionally, HCI has used different kinds of hardware devices, such as instrumented gloves, sensors, actuators and accelerometers, to integrate gestures as an interface for interaction, but these devices do not give the flexibility needed for interacting in a real-time environment. Nevertheless, a number of HCI applications involving hand gesture recognition exist. Applications designed for gesture recognition typically require a restricted background, a set of gesture commands and a camera for capturing images. A number of gesture recognition applications have been designed for presenting, pointing, virtual workbenches, virtual reality and so forth. Gesture input can be classified in different ways [2]. One of the categories is deictic gestures, which refer to pointing at an object or reaching for something. Accepting or refusing an action for an occurrence is termed a mimetic gesture; this is useful for linguistic representation of gestures.
An iconic gesture is a way of describing an object or its features. Pavlovic et al. [3] conclude that the gestures performed by users should be logically interpretable when designing a human computer interface, because the current state of the art in computer-vision-based gesture recognition techniques is not yet at a point of providing an acceptable solution to this problem. A major challenge is the complexity and robustness involved in the analysis and evaluation required to recognize gestures. Different researchers have proposed different pragmatic techniques for using gestures as input to human computer interfaces. Liu and Lovell [4] proposed a technique for real-time tracking of hand gestures captured through a web camera and an Intel Pentium based personal computer, without any use of sophisticated image processing techniques or hardware. In this paper we present an application designed for human computer interaction that uses different computer vision techniques to recognize hand gestures for controlling the VLC media player. The aim of this application is to provide a natural, device-free interface that recognizes hand gestures as commands. The application uses a webcam for image acquisition. To control the VLC media player using the defined gestures, the application focuses on the functions of VLC that are used most frequently. Figure 1 shows the defined functions. The rest of the paper is organized as follows: the architecture design of the application is shown in Section 2. Section 3 covers an overview of the computer vision techniques used in the application. Section 4 presents the methodology designed for the application. Application results are highlighted in Section 5. Section 6 shows the testing and analysis of the application.
The conclusion is discussed in Section 7, with future work in Section 8. The references used by the application are listed in Section 9.

Figure 1. Gesture defined functions.

2. ARCHITECTURE DESIGN

The application uses a hybrid approach for hand gesture recognition. It recognizes static hand gestures. Figure 2 shows the architecture design of the VLC gesture controller. Images are captured from the camera and processed by the following stages/algorithms:

Creation of efficient training images: the aim of this phase is to increase the information about the object of interest (the gesture) in the captured images, using the following steps. The hand is detected in the streaming video using the Pyramidal Lucas-Kanade Optical Flow algorithm [5], [6], which detects the moving points (the hand) in the image. These moving points are passed to the K-Means algorithm [7], [8] to find the centre of motion, which corresponds to the centre of the moving hand. A rectangle is generated around this motion centre and the region inside the rectangle is cropped. After cropping, the image is saved to a specific location for learning, or used directly for recognition.

Learning phase: the efficient images obtained from the above operations are used for training. The Principal Component Analysis algorithm [9], [10] is used for training; it yields image features that are saved in an XML file.

Figure 2. Architecture design.

Recognition phase: efficient images taken from the camera are passed to the K-Nearest Neighbour algorithm [11] for matching against the previously stored gesture database.

VLC interaction: according to the recognized gesture, a pre-defined command is sent to VLC to perform the appropriate action.

3. COMPUTER VISION TECHNIQUES

Tools and techniques for the VLC application: the application uses a hybrid approach for hand gesture recognition that recognizes static hand gestures. The images are captured from the camera and then passed to different algorithms for learning and recognition.
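The motion-centre and cropping steps of the pipeline above can be sketched as follows. This is a minimal NumPy-only illustration, not the paper's actual implementation: the function names, the choice of k, and the fixed box size are all assumptions made for the example.

```python
import numpy as np

def kmeans(points, k=2, iters=20, seed=0):
    """Plain k-means over an (N, 2) array of (x, y) motion points,
    such as those produced by the optical-flow stage."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign each point to its nearest centre (Euclidean distance)
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each centre to the mean of the points assigned to it
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return centers, labels

def crop_around_center(frame, center, half=32):
    """Crop a rectangle of side 2*half around the motion centre,
    clamped to the frame borders (box size is an illustrative choice)."""
    h, w = frame.shape[:2]
    cx, cy = int(round(center[0])), int(round(center[1]))
    x0, x1 = max(cx - half, 0), min(cx + half, w)
    y0, y1 = max(cy - half, 0), min(cy + half, h)
    return frame[y0:y1, x0:x1]
```

In the application's terms, the centre of the cluster containing the most moving points would be taken as the centre of the moving hand, and the cropped region would become a training or recognition image.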
The computer vision techniques used in the application are discussed below.

Pyramidal Lucas-Kanade Optical Flow: this is used in the recognition of gestures. Hand detection can be done using two techniques, i.e. skin colour and motion tracking; here, motion tracking is done using the Lucas-Kanade Optical Flow algorithm. Figure 3 shows the optical flow field generated by the algorithm.

Figure 3. Optical flow field generated by the Optical Flow algorithm.

K-Means algorithm: optical flow generates a vector of moving points. These moving points are organized into clusters for further processing such as cropping, resizing and so on; K-Means [7] is used for the clustering. The method can be defined as follows: given N points, where each point is a d-dimensional vector, k-means clustering partitions the N points into k sets (k < N), S = {S1, S2, ..., Sk}, so as to minimize the within-cluster sum of squares

argmin_S sum(i = 1..k) sum(xj in Si) ||xj - ui||^2

where ui is the mean of Si. In this application the input to the algorithm is the x and y coordinates of the points generated by optical flow, held in two vectors x1 and y1: x1 contains the x coordinates of the points and y1 the y coordinates. As output, K-Means returns the cluster centre, which is then used for cropping and so on. Figure 4 shows the generated cluster.

Figure 4. Generated cluster.

Principal Component Analysis: this algorithm is used to extract the common features of all the images and to reduce their dimension further. The PCA technique involves the following steps:

Calculate the empirical mean [9]: find the empirical mean along each dimension m = 1, ..., M, and place the calculated mean values into an empirical mean vector u of dimensions M x 1.

Calculate the deviations from the mean: store the mean-subtracted data in the M x N matrix B.

Find the covariance matrix C of B.

Find the eigenvectors and eigenvalues of the covariance matrix: compute the matrix V of eigenvectors that diagonalizes the covariance matrix C, i.e. V^-1 C V = D, where D is the diagonal matrix of eigenvalues of C.

Sort the columns of the eigenvector matrix V and the eigenvalue matrix D in order of decreasing eigenvalue, and discard the components with the smallest eigenvalues.

The input to the algorithm is a matrix of image size (M x N) containing information about every pixel of the image. The output is a matrix of common features with reduced dimensions. These features are saved in an XML file.

K-Nearest Neighbour: this algorithm is used for recognition; it takes the input image and identifies the class to which it belongs. The k-nearest neighbours algorithm (k-NN) classifies objects based on the closest training examples in the feature space [11]. An arbitrary instance is represented by (a1(x), a2(x), a3(x), ..., an(x)), where ai(x) denotes the features. The Euclidean distance between two instances xi and xj is

d(xi, xj) = sqrt(sum(r = 1..n) (ar(xi) - ar(xj))^2)

The input to the algorithm is the eigenvalue features in matrix form. When an input frame arrives, the features of the image are calculated and passed to the algorithm as an input parameter. As output, the function returns an integer value that indicates which gesture the image matches. Figure 5 shows the communication between the training and testing phases; KNN lies between the two phases.

Figure 5. Communication between the training and testing phases.

4. APPLICATION METHODOLOGY

The methodology used for the application is as follows.

Hand segmentation: the task is to convert a camera input frame into an image that carries more information, using the following steps:

Converting the image to grey scale: the image is first transformed to grey scale. Optical flow rests on three assumptions, one of which is brightness constancy; to maximize constancy the image must be converted to grey scale.

Detecting moving points: for detection of the hand, two methods are generally followed: skin colour, and tracking of the moving hand. This application uses the second method. The hand is moved in front of the camera against a non-moving background, both when the user operates the application and at the time of feature learning. The Lucas-Kanade Optical Flow technique is applied to track the moving points in the streaming video by comparing the previous frame with the current frame.

Creating clusters: after the moving points have been detected through optical flow, they need to be clustered, which generates more information about the image. The K-Means algorithm is used for this purpose.

Cropping of images: the clusters are cropped and stored in a separate image. Cropping removes the background, noise, etc., so the cropped image contains a higher proportion of useful information than the original. First a rectangle is created around the cluster, and then the region inside the rectangle is copied to a new image.

Saving images: after cropping, the image is saved for the learning process. During learning, all the saved images are read and the algorithms are applied.

Learning phase: the learning phase is divided into two parts:

Extracting features: the application uses fifteen positive images for each gesture. All the images are loaded from their stored locations and PCA is applied to extract their features. Figure 6 shows an input image to be trained.

Figure 6. Input image to be trained.

Saving of features: the features extracted by PCA are saved, along with other important information.

Recognition phase: the recognition phase is divided into two parts:

Loading of the XML document: for recognition, the application loads the XML document.

Matching: the input images are matched against the loaded XML data to decide which gesture they correspond to. Matching is done using KNN.

VLC interaction: after the recognition phase an integer value is obtained, identifying which stored gesture the input gesture matches.
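The learning and matching steps described above can be sketched as a PCA feature extractor paired with a nearest-neighbour matcher. This is a NumPy-only illustration under stated assumptions: the function names and the synthetic "gesture" vectors are hypothetical, and a real run would use flattened cropped hand images rather than random data.

```python
import numpy as np

def pca_fit(X, n_components):
    """Learn a PCA basis from training images. X is (n_samples, n_pixels),
    one flattened gesture image per row. Returns the empirical mean and the
    eigenvectors of the covariance matrix, sorted by decreasing eigenvalue,
    keeping only the top n_components."""
    mean = X.mean(axis=0)               # empirical mean vector
    B = X - mean                        # deviations from the mean
    C = np.cov(B, rowvar=False)         # covariance matrix of B
    evals, evecs = np.linalg.eigh(C)    # eigenvalues / eigenvectors of C
    order = np.argsort(evals)[::-1]     # sort by decreasing eigenvalue
    return mean, evecs[:, order[:n_components]]

def pca_project(X, mean, basis):
    """Project mean-subtracted images onto the basis -> reduced features."""
    return (np.atleast_2d(X) - mean) @ basis

def knn_classify(feature, train_features, train_labels):
    """1-nearest-neighbour matching with the Euclidean distance
    d(xi, xj) = sqrt(sum_r (ar(xi) - ar(xj))^2)."""
    dists = np.linalg.norm(train_features - feature, axis=1)
    return train_labels[dists.argmin()]
```

In the application, the output of `pca_fit` would be what gets serialized to the XML file, and `knn_classify` would return the integer gesture id used for VLC interaction.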
A corresponding keyboard event [8] is then generated for the hotkey that is predefined to perform the user's intended action.

Defining your own gestures: the application gives users the flexibility to define their own gestures for controlling VLC functions. Figure 7 shows the interface; the learning process consists of the following steps:

Figure 7. Interface for defining your own gestures.

Step 1: first click on the Open Camera button, then select the gesture you want to redefine, and then click on Start Capture. Figure 8 shows selecting a gesture and clicking Start Capture while defining your own gestures.

Figure 8. While defining your own gestures, select a gesture and click Start Capture.

Step 2: after fifteen images, a pop-up message will appear stating that fifteen images have been captured. The captured images can then be displayed by clicking on "Show Captured Images", as shown in Figure 9.

Figure 9. After capturing 15 images you can see them all by clicking "Show Captured Images".

Step 3: in the last step of the learning process, click on "Learn". Once the process is complete, a message box will pop up with the message "Learning Complete".

5. RESULTS

The following figures show the results obtained for the various gestures used to control the VLC player.

6. TESTING AND ANALYSIS

Testing of the learning phase: to increase efficiency, the application captures fifteen images of each gesture. At recognition time all gestures are recognized, though with limited robustness. This shows that the features of all gestures are present in the XML file, so the learning phase is tested successfully.

Testing of the recognition phase: Table 1 shows the hand gesture recognition results obtained from the test images stored in the database.

Table 1. Hand Gesture Recognition Results

Gesture        No. of images stored   No. of hits   No. of misses   Recognition rate (%)
Play/Pause     15                     15            0               100
Full Screen    15                     15            0               100
Increase Vol.  15                     11            4               73.33
Decrease Vol.  15                     14            1               93.33
Stop           15                     13            2               86.67

Figure 15 compares the recognition rates of the gestures, based on the numbers of hits and misses of the different gesture commands used for controlling the VLC application. Owing to the noisy background and the gesture shapes, the performance of some gestures decreases. To increase the performance of the application, more test images need to be stored, and decisions about VLC functions should be taken according to the best-recognized gesture.

Figure 15. Comparison of the recognition rates of the different gestures.

7. CONCLUSION

In the current world many facilities are available for providing input to an application; some need physical touch and some work without it (speech, hand gestures, etc.). But few applications are available that are controlled using the convenient input facility of hand gestures. With this method a user can operate an application from a distance without using a keyboard or mouse. This application provides a novel human computer interface through which a user can control the media player (VLC) using hand gestures. The application defines a set of gestures for controlling the functions of the VLC player, and the user can give a gesture as input according to the function of interest. The application also provides the flexibility to define custom gestures for specific commands, which makes it more useful for physically challenged people, as they can define gestures according to what is feasible for them.

8. FUTURE WORK

The application is less robust in the recognition phase. Its robustness could be increased by applying more robust algorithms to reduce noise and motion blur. To control VLC, the application currently uses VLC's global keyboard shortcuts, generating a keyboard event for the relevant shortcut with the keybd_event() function. This is not an ideal way of controlling an application.
Inter-process communication techniques could be applied for this purpose. With inter-process communication in place, VLC could also be replaced with another application very easily.
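The dispatch step that turns a recognized gesture id into a VLC command can be kept behind a small lookup table, so that swapping the backend (a keybd_event() wrapper today, an inter-process communication client later) touches only one function. The gesture ids and key names below are illustrative assumptions, not the paper's actual code, and `send_hotkey` stands in for whichever backend is used.

```python
# Hypothetical gesture-id -> VLC hotkey mapping; the ids and key names
# are illustrative assumptions, not taken from the application's code.
GESTURE_HOTKEYS = {
    0: "space",       # Play/Pause
    1: "f",           # Full Screen
    2: "ctrl+up",     # Increase Volume
    3: "ctrl+down",   # Decrease Volume
    4: "s",           # Stop
}

def dispatch(gesture_id, send_hotkey):
    """Look up the hotkey for a recognized gesture and hand it to the
    injected backend (e.g. a keybd_event wrapper or an IPC client).
    Returns the hotkey sent, or None for an unrecognized gesture."""
    key = GESTURE_HOTKEYS.get(gesture_id)
    if key is not None:
        send_hotkey(key)
    return key
```

Because the backend is passed in as a parameter, replacing the global-hotkey approach with inter-process communication would require no change to the recognition code, only a different `send_hotkey` implementation.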

