My.ADVISOR.com Sign-In
ID
Password

Member Center / Sign-Up
   
SUBSCRIPTION STATUS
If you are a subscriber to this publication, sign-in to access locked articles. To subscribe or renew go to www.AdvisorStore.com.
Go to Article
Advanced Search 

TECHNIQUE

Create Features from Text

Use parsing routines to add new features to your projects.

 DOWNLOAD (24,892 bytes) -- The code used in this article.
By John Mark Osborne

I almost titled this third article in the Philosophy of FileMaker series "Parsing Text," but manipulating text is much more than cleaning up data. It lets you create features not possible with a single FileMaker Pro tool. The power of FileMaker Pro exists in combining tools, and the Philosophy of FileMaker teaches you how to use these tools to create your unique solution. As you read this article, don't memorize the techniques, but try to understand what they do.

The basics

Before diving into feature creation, let's discuss the foundation of text parsing. When I started writing this article, I decided to look up the word parse. Several dictionaries later, I finally found a definition appropriate to the computer sciences: "to analyze or separate (input, for example) into more easily processed components." For the database developer, parse means to separate first and last names into different fields. That's just where this journey begins. Before you create features with text parsing, there are some foundational concepts to understand.

Let's say you import contact information into a FileMaker Pro database. Unfortunately, all the first and last names are contained in a single field, making sorting by last name impossible. For the sake of a simple example, all the names contain a single word for the first name and a single word for the last name. There are more sophisticated formulas for parsing multiple word last names, middle names, titles, and more, but that isn't the focus of this article.

The best way to study a text parsing routine is to use example data. Let's use my name in a field called Name:

Name = John Osborne

The simplest way to parse my name into separate fields is to use the Left and Right functions:

Left(Name, 4)

Right(Name, 7)

The Left function grabs everything from the first to fourth character while the Right function grabs seven characters starting from the end. The result for the first formula is "John" and the second is "Osborne". But the formulas can't handle first names less or greater than four characters, or last names less or greater than seven characters. Luckily, there are Word functions in FileMaker Pro 3.0 and later. So, a more dynamic solution is:

LeftWords(Name, 1)

RightWords(Name, 1)

Word functions, such as LeftWords, MiddleWords, and RightWords, use an algorithm to determine where one word ends and another begins, providing a dynamic text parsing formula. The essence of the algorithm is that spaces are word separators, but it's much more complicated. For instance, spaces aren't the only characters identified as word separators, and it's your job to know those characters. I'm not saying memorize the word separator character set, but know how to test data against the Word functions. Simply place your formula in a Define Fields calculation and start entering sample data. Test as many variables as possible within reasonable expectations of the type of data that will be entered. For example, try entering:

204.182.23.89

Osborne, John

jmo@filemakerpros.com

Punctuation by itself isn't a word separator, but punctuation followed by a space is. Many characters are word separators even if they aren't followed by a space, such as an "at" sign (@) in an e-mail address. Remember, the Word function algorithm may not react exactly as you expect, so test it with data you expect the user to enter.

After you've tested your data, transfer your formula from a Define Fields calculation to a scripted calculation, such as a Set Field or Replace step. Why? Because this formula doesn't have to recalculate each time the Name field is updated. After you've parsed the first and last name data into separate fields, you don't need the Name field. A scripted calculation is better, because it parses only when initiated and lets users modify the data in the first and last name fields. One of the biggest mistakes inexperienced developers make is using all their formulas in Define fields. Pick the right place to use formulas based on how you'll use the data.

Before FileMaker Pro 3.0

In the old days, FileMaker programmers didn't have Word functions. To parse a first and last name, you had to combine the Left and Right functions with the Position function. But these formulas still serve an important role. The Word functions are great if the algorithm matches your needs, but more complex parsing solutions often require a nested function incorporating the Position function. Let's examine a better way of parsing text with the same name example:

Left(Name, Position(Name, " ", 1, 1) - 1)
Right(Name, Length(Name) - Position(Name, " ", 1, 1))

This produces the same results as the LeftWords and RightWords functions. Why would you write a more complicated formula? The simple answer is control. The Position function lets you specify any search character or string. It lets you design the algorithm rather than leaving yourself at the mercy of the Word function.

Let's examine these two formulas more closely. Any time you want to study a nested function, work from the inside out. Starting with the first example, isolate the Position function nested inside the Left function. The Position function returns a number value representing the character position of a search value in a string and has four parameters. The first parameter is the text you are searching and can be a string in quotes, a field reference, or another function. In this example, the Name field is the text being searched. The second parameter is the search string and can also be a string, a field, or a function. In this example, there's a space between quotes to locate the word separator between the first and last name. The third parameter is the character position to start looking and is generally a value of "1" so the Position functions start at the beginning of the text. The last parameter is the occurrence and is almost always a "1" because the first occurrence of the search string is typically the target.

It helps to substitute data for field references when working with a nested calculation to understand how it works:

Left(Name, Position("John Osborne", " ", 1, 1) - 1)

Because the space is located at the fifth character, the Position function returns a five:

Left(Name, 5 - 1)

Subtract one from five and you have the Left(Name, 4) formula discussed earlier, except the number of characters returned by the Position function changes depending on the length of the first name. Even better, you can control the search string by modifying this formula for any text parsing job.

Don't forget the slightly more complicated function for grabbing the last name. The Right function works differently than other text functions in that it starts from the right side of the text. Because the Position function starts from the left side, the only way to marry the two formulas is to figure out the length of the last name. You do so by subtracting the beginning point from the ending point. Because the last name is the last word in the field, the Length function is able to provide the ending point. Now, all you have to do is subtract the Length from the position of the space to get the number of characters in the last name.

Memorize these fundamental text parsing formulas, because they serve as the foundation for almost all text parsing tasks. The Philosophy of FileMaker usually doesn't recommend memorizing information, but it's okay in this case because these formulas are the building blocks for all parsing tasks.

New request made easier

Now to create a feature using text parsing. There are many examples, but I chose one based on a common developer mistake. Many developers offer a single find layout for all a user's searching needs. The layout comprises every field a user might require to search, rather than multiple layouts customized for each search task. For example, through interviews with a group of users, a developer might discover two predominant search tasks. Rather than creating one giant find layout, create a find layout for each of these tasks. This makes the interface easier to understand and the users happier.

Along the same lines is trying to teach casual FileMaker users how to create new find requests. If users have difficulty distinguishing the difference between browse and find mode, what do you think will happen with new requests? Wouldn't it be better to provide a simple request interface and create the new requests for the user? In the following example, the new request interface is a single field to show you the concept without bells and whistles. After you understand it, you can apply it to more complicated find tasks.

Let's start with the specialized find layout. Instead of entering find mode to perform the find, show the layout for creating new requests in browse mode. Rather than regular fields, use global fields for find criteria entry. Again, this example is simplified, so there's only one global field. To make data entry as simple as possible, format the global field as a check box with all the possible find requests.

The real magic comes when you create the script. Take a look at the script in listing 1, then I'll walk through it.

Listing 1: New request made easier -- This script parses a formatted check box field into new find requests so the user doesn't have to know FileMaker Pro.

The first two steps are standard and should look familiar. Starting with the third step, Set Field initializes the global field where new request data is entered. Do this before displaying the special find layout to prevent the user from seeing previous find criteria. The script pauses so the user can enter his find criteria. When the user continues the script, it enters find mode.
The key is the Set Field step at the beginning of the loop. It transfers values in the global check box field to new requests in find mode. Here's the formula with returns between each parameter so you can see where one ends and another begins:

Middle(

"¶" & x.find & "¶",

Position("¶" & x.find & "¶", "¶", 1, Status(CurrentRequestCount)),

Position("¶" & x.find & "¶", "¶", 1, Status(CurrentRequestCount) + 1) -
Position("¶" & x.find & "¶", "¶", 1, Status(CurrentRequestCount))

)

This formula grabs each value from the check box. When a user selects multiple values, or other value list formatted field, the values are stored as a return-separated list. If you select red, green, and blue, the values are stored like this:

red¶
green¶
blue

To decipher a complicated formula, work from the inside out. Let's look at each parameter of the Middle function. The Middle function is similar to the Left and Right functions except it extracts text from the middle of a piece of text. It's comprised of three parameters. The first is the text being parsed: In this example, a concatenation of text and a field. The reference to the x.find field is the global field formatted as a check box where the user inputs his new request choices. But why are the returns added to the beginning and end of the x.find field? The reason becomes clear if you consider a different example, such as a return-separated list of numbers:

21¶

6

If you search for "1" in this list, two occurrences are located because "1" also exists at the end of the number 21. If you place returns at the beginning and end of the list, you can search for a more unique value:

21¶



You can now search for "¶1¶" and locate a single occurrence of "1". While this scenario is less likely to occur with text values, it's possible. Get in the habit of plugging up this programming hole or it may haunt you later.

The second parameter of the Middle function tells FileMaker at which character to start extracting text. Instead of a static value, the starting point is the result of a Position function. The parameters are similar to the foundational example, except for the search string and the occurrence parameter. It makes sense to search for a return character, because it surrounds the values you want to extract. But what's up with the Status(CurrentRequestCount) function? It returns a number that corresponds to the current number of requests in find mode. It changes as the loop creates each request to update the calculation to get the next value in the return-separated list.

If you understand how the starting point is located, the formula in the third parameter is similar. The third parameter of the Middle function asks for the number of characters you want to grab, starting from the value in the second parameter. To get the number of characters or distance, subtract the end point from the starting point (the formula would be a lot easier to understand if the Middle function wanted the end point). The formula for the starting point is easy because it's the same as the formula in the second parameter. The ending point formula isn't much different, because you want to locate the following return in the list. To modify the Position formula to locate the next return, just add "1" to the Status(CurrentRequestCount). This increments the current request count by one and locates the return at the end of the value you're trying to extract.

To prevent an endless loop, offer a step for exiting. Only seven script steps can exit a loop: Halt Script, Exit Script, Quit Application, Close, Exit Loop If, Go to Record/Request/Page [Exit after Last, Next], and Go to Portal Row [Exit after Last, Next]. You might consider the Go to Record/Request/Page step, but new requests are being made to accommodate the number of values in the return-separated list rather than looping through existing requests. That leaves one other viable option: Exit Loop If. But what condition will exit the loop at the right time? You just have to count the number of return characters in the list and compare it to the number of requests. When the number of requests equals the number of returns, all the values in the list have been parsed into new requests.

After the loop is complete and all the values have been transferred to new requests, you can initiate the find. The rest is fairly standard scripting, merely providing error checking in case no records are found or no find criteria is entered. If the error checking passes, the user returns to the layout where he started to view his search results.

Get parsing

If you implement this technique, you'll probably use a more complicated combination of values to create your new requests. While some find criteria may differ from request to request, others may stay the same on each request, as was the case in a solution I created for a LASIK eye surgeon. My client wanted to provide a search screen for locating all possible surgery types without having to teach other surgeons how to create new find requests. Some data, such as age range, had to be the same on every find request. The solution is simple: Create the first find request before entering the loop and after entering the loop, duplicate the find request instead of creating a new one, and only modify the values that have to change.

You can create many other features using parsing routines. Many parsing techniques are free to download from my Web site, such as parsing a CGI generated e-mail, extracting
e-mail addresses, removing the current record from a portal based on a self-join relationship, highlighting find criteria in the found set, and more.


John OsborneTechnical editor John Mark Osborne is president and owner of Database Pros, offering FileMaker resources on the Internet. He is author of Scriptology, a speaker at FileMaker Developer Conferences and Macworld conferences, and a trainer for the Professional Training series created by FileMaker, Inc. http://www.databasepros.com jmo@filemakerpros.com

Printer-friendly
page layout

Create Features from Text

No reader comments ... yet.

    What do YOU think about this topic? Share your advice and thoughts using this form.

    Your Name

    REQUIRED : PUBLIC

    Your E-Mail

    REQUIRED : PRIVATE

    Job, Company

    OPTIONAL : PUBLIC

    City, State, Country

    OPTIONAL : PUBLIC

    Your Web Site

    OPTIONAL : PUBLIC

    Your Comment

    Please help everyone by keeping your comments on-topic, using clean language, and not defaming or making personal attacks.


    Your e-mail address is required, but it will not be displayed to the public or given to anyone. See our Privacy Policy. Comments become visible after they pass our spam filter, and spammers and abusers are permanently blocked. Please report spam or abuse.

    ARTICLE INFO

    FREE ACCESS FREE ACCESS


    File: The code used in this article.
    DOWNLOAD: 24,892 bytes

    Keyword Tags: application development, Application Development, collaboration, database, database development, development, Database, Database Development, E-Mail, filemaker, filemaker development, FileMaker, FileMaker Development, FileMaker Pro, messaging, portals, software development

    Use of this or any other site, content, product or service of Advisor Media constitutes acceptance of Terms of Use.
    Portions copyright ©1983-2010 Advisor Media, LLC. All Rights Reserved.
    Reuse or reproduction of any portion or quantity of Advisor Media's copyrighted content, in any form, for any purpose, requires written permission.
    ADVISOR®, the ADVISOR logo, and other names and logos that incorporate ADVISOR are registered trademarks, trademarks or service marks of Advisor Media, LLC in the United States and/or other countries.
    Other trademarks are used for identification, editorial or descriptive purposes and are the property of their owners.
    Hosted by Prominic.NET Website powered by
    LOTUS SOFTWARE
    mff0206 OSBOJ05 posted 2002-5-13 mod 03/12/2010 03:05:57 AM ztfmfd/ztfmfd
    domino-144.advisor.com my.advisor.com 03/12/2010 07:22:13 AM