Finding Common Lines in Two Containers

This lesson demonstrates how to find the lines that two containers have in common, a useful technique for comparing lists.

The UI

The UI for this example consists of two fields, each containing some text, and a button to find the lines that are common between the two fields.

The commonLines function

The commonLines function finds the lines that appear in both of two lists. It does this by creating an array for each list, with the text of each unique line as a key, then using the intersect command to keep only the keys that are in both arrays. A typical call looks like this:

put commonLines(tBaseList, field "New Lines") into tMyLines

Creating arrays from the lists

The first thing the handler does is to create an array of the lines in the pFirstList parameter. You might think that the plain form of the split command (split pFirstList by cr) would do this, but that form makes each line the content of an array element whose key is the line number. Instead, we want each line to be the key of an element. The content of each element doesn't matter for this handler.

To create this array, we use the "as set" form of the split command, which converts the variable in place, so pFirstList itself becomes the array. The element's key is the text in the line, and its content is the constant true. If a line appears more than once in the list, the array will contain only one element for that line, instead of one for each repetition, because an array can't have two elements with the same key. But this is fine: since the function is looking for common lines, we don't care whether a line is repeated in pFirstList, only whether it appears at all.

split pFirstList by cr as set

Next, the handler converts the pSecondList parameter into an array in the same way. We now have two arrays, and the keys of each array are the lines in the corresponding list.

split pSecondList by cr as set
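
To make the difference between the two forms of split concrete, here is a small sketch that could be placed in the script of a scratch button. The handler and its variable names (tList, tNumbered, tSet, tReport) are illustrative only and are not part of the lesson's stack.

```livecode
on mouseUp
	local tList, tNumbered, tSet, tReport
	put "apple" & cr & "pear" & cr & "apple" into tList

	## plain form: the keys are 1, 2, 3 and each line is an element's content
	put tList into tNumbered
	split tNumbered by cr
	put "Element 2 of the numbered array:" && tNumbered[2] & cr after tReport

	## set form: each distinct line becomes a key whose content is true
	put tList into tSet
	split tSet by cr as set
	put "Keys in the set:" && the number of lines of (the keys of tSet) & cr after tReport
	put "Content stored under apple:" && tSet["apple"] after tReport

	answer tReport
end mouseUp
```

With this three-line sample, the report should show pear, a key count of 2 (the repeated apple collapses into a single key), and true.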

Finding the common lines

To find the common lines, we need a list of all the elements whose keys appear in both arrays. This is exactly what the intersect command does: given two arrays, it retains only the elements whose keys match.

After the intersect command is executed, pFirstList contains only the elements whose keys are in both arrays. Each key is the text of a common line, so the function returns the list of keys in this array.

intersect pFirstList with pSecondList

## return the corresponding lines of text:
return the keys of pFirstList
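
As a rough illustration of what intersect does to each array, the following sketch runs the same steps on two short colour lists. The variable names and sample data are made up for this example. It also sorts the result before displaying it, because the keys function does not return keys in any guaranteed order.

```livecode
on mouseUp
	local tFirst, tSecond, tCommon
	put "red" & cr & "green" & cr & "blue" into tFirst
	put "green" & cr & "blue" & cr & "black" into tSecond

	split tFirst by cr as set
	split tSecond by cr as set

	## intersect changes tFirst in place and leaves tSecond untouched
	intersect tFirst with tSecond

	## sort the keys so the display order is predictable
	put the keys of tFirst into tCommon
	sort lines of tCommon

	answer tCommon & cr & "Keys left in second array:" && the number of lines of (the keys of tSecond)
end mouseUp
```

The dialog should list blue and green, followed by a count of 3 for the second array, which still holds all of its original keys.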

This handler takes advantage of LiveCode's associative arrays. In many languages, plain arrays are indexed only by integers, and string keys require a separate map or dictionary type; in LiveCode, any string can be used as an array key. Here, that capability lets the keys themselves, not just the contents, carry the meaningful data, and it is what makes the intersect technique possible.

The commonLines function code

This function belongs in the card script.

function commonLines pFirstList, pSecondList
	## convert each list into an array in place
	## the array keys are the lines in the list:
	split pFirstList by cr as set
	split pSecondList by cr as set

	## retain only elements that are found in both arrays:
	intersect pFirstList with pSecondList

	## return the corresponding lines of text:
	return the keys of pFirstList
end commonLines
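
Because the keys of an array come back in no guaranteed order, the result of commonLines may not follow the order of either list. If order matters, one possible variant (a sketch, not part of the lesson) walks the first list in order and uses only the second list as a set. The name commonLinesInOrder is hypothetical, and skipping blank lines is an assumption made for this sketch.

```livecode
function commonLinesInOrder pFirstList, pSecondList
	local tLine, tCommonLines

	## only the second list is converted to a set, for fast lookups
	split pSecondList by cr as set

	repeat for each line tLine in pFirstList
		## assumption for this sketch: blank lines are skipped
		if tLine is empty then next repeat

		if pSecondList[tLine] is true then
			put tLine & cr after tCommonLines
			## remove the key so a repeated line is reported only once
			delete variable pSecondList[tLine]
		end if
	end repeat

	## drop the trailing return
	delete the last char of tCommonLines
	return tCommonLines
end commonLinesInOrder
```

It is called the same way as commonLines, for example: answer commonLinesInOrder(field "list1", field "list2").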

The "Common Lines" button code

on mouseUp
	answer commonLines(field "list1",field "list2")
end mouseUp

A note on efficiency

You might wonder whether the array operations in this example (constructing the two arrays and executing the intersect command) make it slower than an approach like the following, which uses a chunk expression to check whether each line in pFirstList is also in pSecondList:

repeat for each line tLine in pFirstList
	if tLine is among the lines of pSecondList then
		put tLine & return after tCommonLines
	end if
end repeat

While array operations typically have some overhead, the approach used in this example turns out to be much faster than the loop above once the lists are longer than ten lines or so. The loop rescans pSecondList for every line of pFirstList, whereas an array looks up each key directly. If pFirstList is ten lines long, the two approaches are approximately equal (within a factor of two), but as pFirstList grows, the array approach quickly gains a speed advantage.
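
To check this on your own machine, a timing sketch along the following lines could go in the script of a separate test button. It assumes the commonLines function from this lesson is available in the message path (for example, in the card script). The function commonLinesByLoop, the list size of 2000 lines, and the generated sample data are all assumptions made for this sketch.

```livecode
function commonLinesByLoop pFirstList, pSecondList
	local tLine, tCommonLines
	repeat for each line tLine in pFirstList
		if tLine is among the lines of pSecondList then
			put tLine & return after tCommonLines
		end if
	end repeat
	delete the last char of tCommonLines
	return tCommonLines
end commonLinesByLoop

on mouseUp
	local tIndex, tFirst, tSecond, tStart, tArrayTime, tLoopTime, tIgnored

	## build two 2000-line lists; the even-numbered lines of tFirst
	## also appear in tSecond
	repeat with tIndex = 1 to 2000
		put "line" && tIndex & cr after tFirst
		put "line" && (tIndex * 2) & cr after tSecond
	end repeat
	delete the last char of tFirst
	delete the last char of tSecond

	## time the array approach
	put the milliseconds into tStart
	put commonLines(tFirst, tSecond) into tIgnored
	put the milliseconds - tStart into tArrayTime

	## time the chunk-expression loop
	put the milliseconds into tStart
	put commonLinesByLoop(tFirst, tSecond) into tIgnored
	put the milliseconds - tStart into tLoopTime

	answer "Arrays:" && tArrayTime && "ms" & cr & "Loop:" && tLoopTime && "ms"
end mouseUp
```

The actual numbers depend on the machine and the LiveCode version, so treat them as a rough comparison. Changing the loop limit lets you see how the gap changes with list length.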

5 Comments

Richard M Kriesel

The sample code works but is not as exemplary as it should be. It contains two repeat loops which are not necessary. Each repeat loop can be reimplemented as a single split command.

on intersectLists t1, t2
split t1 by cr as set
split t2 by cr as set
intersect t1 with t2
return the keys of t1
end intersectLists

sam norris

Hello,

Yes, that would be a more efficient manner of setting up the arrays and by all accounts it executes twice as fast. I have now substituted the repeat loops with the split command in the lesson.

Dick Kriesel

Thanks for the splits, Sam. But the new code has a variable-naming conflict involving prefix p and prefix t.

Richard M Kriesel

This lesson belongs in section "Text" rather than in section "Graphics and Objects."

Panos Merakos

Hello Dick
Thanks for spotting that - the example is now fixed. Also, we will move the lesson in the appropriate section asap.
