Broad Network


Metacharacters when Searching within a Site using ECMAScript and MySQL

Search Within a Site using ECMAScript and MySQL – Part 3

Web Development with ECMAScript and MySQL in Node.js

Foreword: In this part of the series, I talk about metacharacters when searching within a site with ECMAScript and MySQL.

By: Chrysanthus Date Published: 19 Oct 2016

Introduction

This is part 3 of my series, Search Within a Site using ECMAScript and MySQL. In this part of the series, I talk about metacharacters when searching within a site with ECMAScript and MySQL. You should have read the previous part of the series before coming here, as this is a continuation.

In the previous part of the series, ECMAScript regular expressions and MySQL regular expressions were used. Sometimes keywords have metacharacters. A good example of a word with metacharacters is the name of the computer language, C++. It has two plus signs, which are metacharacters. A keyword is a normal word that is important in a passage. So, in some articles, “C++” is a keyword. The question here is, how do you type a metacharacter in a regular expression (regex)? Also, how do you find and replace a metacharacter that you do not know in advance, in a string?

In this tutorial I use the “C++” keyword for illustration. After that I talk about the problem of finding and replacing a metacharacter that you do not know in advance, in a string (word).

Metacharacters
The metacharacters in ECMAScript regex that I know are: { } [ ] ( ) ^ $ . | * + ? \ . The metacharacters in MySQL regex that I know are: ^ $ . * + ? | { } [ ] : = > < . Do not confuse white space characters with metacharacters. In ECMAScript, white space characters include ‘\t’, ‘\r’, ‘\n’, and ‘\f’. In MySQL, white space (escape sequence) characters are: '\b', '\t', '\n', '\v', '\f', and '\r'.

How do you type a metacharacter in a regex? Answer: in ECMAScript, you precede the metacharacter with a backslash, as in /C\+\+/. In MySQL, you precede the metacharacter with two backslashes, as in "C\\+\\+". MySQL's string parser normally consumes one backslash of each pair, so the regex engine receives \+.
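As a rough illustration of the two escaping rules above, the following sketch tests the ECMAScript regex and builds the pattern text that MySQL has to receive. The variable names are hypothetical and are not taken from the series files. Note that when the MySQL pattern is typed inside an ordinary ECMAScript string, each backslash has to be doubled once more, because ECMAScript strings also treat the backslash as an escape character; String.raw avoids that.

```javascript
// ECMAScript regex: one backslash before each metacharacter
const jsRegex = /C\+\+/;
const jsMatches = jsRegex.test("I write C++ code");
console.log(jsMatches);            // true

// Without the backslashes the pattern is not even a valid regex
let unescapedIsValid = true;
try {
    new RegExp("C++");
} catch (e) {
    unescapedIsValid = false;      // SyntaxError: nothing to repeat
}
console.log(unescapedIsValid);     // false

// MySQL regex text: two backslashes before each metacharacter.
// String.raw keeps the backslashes exactly as typed.
const mysqlPattern = String.raw`C\\+\\+`;
// The same text typed as an ordinary string needs four backslashes per pair
const sameWithEscapes = "C\\\\+\\\\+";
console.log(mysqlPattern === sameWithEscapes);   // true

// The condition as it would appear inside the SQL statement
const condition = `series.keywords rLike "${mysqlPattern}"`;
console.log(condition);            // series.keywords rLike "C\\+\\+"
```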

Metacharacters and the ECMAScript site Search Engine Script
Here I talk about the ECMAScript file of the previous part of the series. The ECMAScript file uses regular expressions in two places. In the first place, an operation copies each word of the search phrase into an array. In the second place, ECMAScript “wraps” a MySQL regular expression; this happens when the MySQL SELECT query is formed.

In the first place, the ECMAScript operation is:

                searchStringWordsArr = searchStr.match(/\b\w+\b/g);

In the second place, you have the code segment:

                //form the WHERE clause of SQL select statement
                //match() returns null when no word is found, so guard against that
                var numberOfWords = searchStringWordsArr ? searchStringWordsArr.length : 0; //no. of keywords
                var whereStr = "";
                var temp;
                if (numberOfWords > 0)
                 {
                 var firstKeyword = searchStringWordsArr[0];
                 whereStr = ` WHERE (series.keywords rLike "${firstKeyword}")`;
                 //each further keyword adds an AND condition
                 for (var k=1; k<numberOfWords; ++k)
                 {
                                temp = searchStringWordsArr[k];
                                whereStr += ` AND (series.keywords rLike "${temp}")`;
                 }
                 }

Here, “rLike” is a MySQL regular expression operator, and the MySQL regex for each condition is held in an ECMAScript variable: firstKeyword for the first keyword and temp for each of the others.
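To see what the code segment above produces, here is a minimal sketch that wraps the same loop in a function and prints the WHERE clause for a two-word search phrase. The name buildWhereClause and the sample phrase are hypothetical and are not part of the original file. Keywords are interpolated as they are, so the sketch assumes they contain only word characters.

```javascript
// Hypothetical helper: the WHERE clause logic wrapped in a function
function buildWhereClause(words) {
    // match() gives null when the phrase has no words
    if (!words || words.length === 0) return "";
    let whereStr = ` WHERE (series.keywords rLike "${words[0]}")`;
    for (let k = 1; k < words.length; ++k) {
        whereStr += ` AND (series.keywords rLike "${words[k]}")`;
    }
    return whereStr;
}

const phraseWords = "node mysql".match(/\b\w+\b/g);   // ["node", "mysql"]
const whereClause = buildWhereClause(phraseWords);
console.log(whereClause);
// prints:  WHERE (series.keywords rLike "node") AND (series.keywords rLike "mysql")
```

Every keyword adds one more AND condition, so a row is selected only if its keywords column matches all the words of the search phrase.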

Search Phrase Having C++
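If the search phrase has “C++”, then with the above code as given, the word “C++” will not be copied whole into the array, searchStringWordsArr; only the “C” is copied. This is because + is not a word character, so \w+ does not match it, and + cannot simply be typed into the regex because it is a metacharacter. To solve the problem, the first regular expression above has to be rewritten as: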

                searchStringWordsArr = searchStr.match(/\w*c\+\+\w*|\b\w+\b/g);

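The regex is now /\w*c\+\+\w*|\b\w+\b/ instead of just /\b\w+\b/. Note that the + signs have been escaped. So the operation now searches for C++ (c\+\+) or an ordinary word (\b\w+\b) and places each match in the array. Note also that the c in the regex is lower case; without the i flag, it matches “C++” only if the search phrase has already been converted to lower case.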
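The difference between the two regexes can be checked with a short sketch like the one below. The sample search phrase is made up for illustration, and it is typed in lower case because the c in the regex is lower case.

```javascript
const searchStr = "learn c++ programming";   // hypothetical search phrase

// Original regex: the plus signs are dropped, only "c" is captured
const wordsBefore = searchStr.match(/\b\w+\b/g);
console.log(wordsBefore);   // [ 'learn', 'c', 'programming' ]

// Rewritten regex: "c++" is tried first, then an ordinary word
const wordsAfter = searchStr.match(/\w*c\+\+\w*|\b\w+\b/g);
console.log(wordsAfter);    // [ 'learn', 'c++', 'programming' ]
```

The c\+\+ alternative is placed before the ordinary word alternative so that it is tried first; otherwise \b\w+\b would take the “c” and leave the plus signs behind.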

Unknown Metacharacter in a Keyword
Above, we know that the peculiar word is “C++” and the metacharacter is +. In theory, you can have situations where you are not sure of the peculiar word and you are not sure of the metacharacter. In this case, the ECMAScript file has to be modified again.

One way to do this is to know the different possible peculiar words, and then adjust the file accordingly. Assuming that there are three possible peculiar words, which are “C++”, “C**”, and “W^^H”, the first regular expression operation above would be:

            searchStringWordsArr = searchStr.match(/w\^\^h|c\*\*|\w*c\+\+\w*|\b\w+\b/g);
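
When the metacharacter is not known in advance, an alternative to listing every peculiar word is to find every metacharacter in a keyword and replace it with its escaped form before the keyword goes into a regex. The sketch below is an assumption and not code from the ECMAScript file of the series: escapeForJsRegex and escapeForMysqlRegex are hypothetical helper names, and the character class is simply the ECMAScript metacharacter list given earlier. It does not deal with quotation marks inside a keyword.

```javascript
// Character class with the metacharacters . * + ? ^ $ { } ( ) | [ ] \
const META = /[.*+?^${}()|[\]\\]/g;

// Hypothetical helper: one backslash before each metacharacter (ECMAScript regex)
function escapeForJsRegex(keyword) {
    return keyword.replace(META, "\\$&");     // $& stands for the matched character
}

// Hypothetical helper: two backslashes before each metacharacter (MySQL regex text)
function escapeForMysqlRegex(keyword) {
    return keyword.replace(META, "\\\\$&");
}

const jsSafe = escapeForJsRegex("c++");
console.log(jsSafe);                                   // c\+\+
console.log(new RegExp(jsSafe).test("i like c++"));    // true

const mysqlSafe = escapeForMysqlRegex("w^^h");
console.log(mysqlSafe);                                // w\\^\\^h
console.log(` WHERE (series.keywords rLike "${mysqlSafe}")`);
```

With helpers of this kind, a keyword such as “c**” or “w^^h” would not need its own hand-written alternative when the MySQL regex is formed.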

Wow, I find the production of this series exciting; I hope you find it interesting. We have come to the end of this part of the series. We take a break here and continue in the next part.

Chrys

