Discussion:
Form validation: add space to postcode
s***@hotmail.com
2010-10-13 07:07:42 UTC
A postcode in the U.K. normally has 1 or 2 characters, followed by 1
or 2 numbers, a space, a number and two characters. Validating a form
value to see it matches this criteria is easy (using regular
expressions) and documented all over the web.

However, I have not been able to trace a method of inserting a space
in the correct place if the user has not done so. For example, if the
user types "me12tr", changing it to upper case is easy, so we have
"ME12TR", but I would like in this instance to automatically insert a
space between the 1 and the 2, without having to prompt them to do so.

The start of U.K. postcodes vary (M1, CT2, SW1A are all valid), but
there seems always to be one digit and two characters after the space,
at the end of the postcode. So is there a way in Javascript I can
insert a space between the third and fourth characters *counting from
the right* of the entered string, to correct a postcode entered
without the required space?

Thank you for any advice you can give.

Steve
RobG
2010-10-14 11:28:05 UTC
Post by s***@hotmail.com
A postcode in the U.K. normally has 1 or 2 characters, followed by 1
or 2 numbers, a space, a number and two characters.  Validating a form
value to see it matches this criteria is easy (using regular
expressions) and documented all over the web.
However, I have not been able to trace a method of inserting a space
in the correct place if the user has not done so.  For example, if the
user types "me12tr", changing it to upper case is easy, so we have
"ME12TR", but I would like in this instance to automatically insert a
space between the 1 and the 2, without having to prompt them to do so.
The start of U.K. postcodes vary (M1, CT2, SW1A are all valid), but
there seems always to be one digit and two characters after the space,
at the end of the postcode.  So is there a way in Javascript I can
insert a space between the third and fourth characters *counting from
the right* of the entered string, to correct a postcode entered
without the required space?
I would write a function that gets the length of the string passed to
it, then use String.prototype.substring() to create a string that was
composed of the parts from 0 up to but not including length-3, a
space, then from length-3 to the end.

But that might be just me. :-)
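
A minimal sketch of the substring approach described above. The function name is hypothetical, and it assumes the value has already had any spaces removed:

```javascript
// Hypothetical helper name; not from the thread.
function insertPostcodeSpace(value) {
  var len = value.length;
  // Too short to have a three-character tail plus something in front of it.
  if (len < 4) return value;
  // Everything up to (but not including) length-3, a space, then the last three.
  return value.substring(0, len - 3) + " " + value.substring(len - 3);
}

console.log(insertPostcodeSpace("ME12TR")); // ME1 2TR
```

Calling substring() with a single argument returns everything from that index to the end of the string, so no end index is needed for the tail.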


--
Rob
Denis McMahon
2010-10-14 12:21:07 UTC
Post by s***@hotmail.com
A postcode in the U.K. normally has 1 or 2 characters, followed by 1
or 2 numbers, a space, a number and two characters. Validating a form
value to see it matches this criteria is easy (using regular
expressions) and documented all over the web.
However, I have not been able to trace a method of inserting a space
in the correct place if the user has not done so. For example, if the
user types "me12tr", changing it to upper case is easy, so we have
"ME12TR", but I would like in this instance to automatically insert a
space between the 1 and the 2, without having to prompt them to do so.
The start of U.K. postcodes vary (M1, CT2, SW1A are all valid), but
there seems always to be one digit and two characters after the space,
at the end of the postcode. So is there a way in Javascript I can
insert a space between the third and fourth characters *counting from
the right* of the entered string, to correct a postcode entered
without the required space?
Thank you for any advice you can give.
Validate the postcode. One or two letters, followed by a digit, followed
by an optional letter or digit, followed possibly by 1 or more spaces,
followed by a digit and two letters.

The trick, I think, is that the second part of the code always follows a
standard format, while the first part comes in a multitude of formats.

Then test if the postcode contains a space. If it does not, split off the
last 3 characters and insert a space before them.

Finally uppercase it.

/*
function returns empty string or formatted postcode
*/

function validatePostcode(pcode)
{
  var patt = /^[a-z]{1,2}[0-9][0-9a-z]? ?[0-9][a-z]{2}$/i;
  var test = patt.test(pcode);
  if (!test) return ""; // didn't validate
  patt = / /;
  test = patt.test(pcode);
  if (!test) // no space
  {
    var pcl = pcode.length;
    var pc1 = pcode.substring(0, pcl - 3); // everything before the last 3 characters
    var pc2 = pcode.substring(pcl - 3); // the last 3 characters
    pcode = pc1 + " " + pc2;
  }
  pcode = pcode.toUpperCase(); // toUpperCase() returns a new string
  return pcode;
}

I haven't tried the function, and I'm sure someone will improve on it.
Perhaps to clean away any leading or trailing space, for example.
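
The clean-up of leading or trailing space suggested above could happen before the format check. A sketch only, assuming String.prototype.trim() is available (ECMAScript 5 or later); the function name is hypothetical:

```javascript
// Hypothetical variant that trims the input before checking the format.
function validateTrimmedPostcode(pcode) {
  // Strip leading and trailing whitespace first.
  pcode = pcode.trim();
  // Same format check as the pattern used in the thread.
  if (!/^[a-z]{1,2}[0-9][0-9a-z]? ?[0-9][a-z]{2}$/i.test(pcode)) return "";
  // Insert the space before the last three characters if it is missing.
  if (pcode.indexOf(" ") === -1) {
    var cut = pcode.length - 3;
    pcode = pcode.substring(0, cut) + " " + pcode.substring(cut);
  }
  return pcode.toUpperCase();
}

console.log(validateTrimmedPostcode("  me12tr ")); // ME1 2TR
```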

Rgds

Denis McMahon
s***@hotmail.com
2010-10-14 14:08:57 UTC
Thank you, Rob and Denis, for your suggestions and it has certainly
given me something to work on. Obviously Substring is the function I
need to concentrate on! I think, Denis, your function is more or less
complete as it is.

I am interested, Denis, in your technique in the first few lines:

1 var patt = /^[a-z]{1,2}[0-9][0-9a-z]? ?[0-9][a-z]{2}$/i;
2 var test = patt.test(pcode);
3 if (!test) return ""; // didn't validate
4 patt = / /;
5 test = patt.test(pcode);

I understand the first line is loading patt with the regexp for the
postcode. Does the ? ? mean "return as true whether there is a space
in there or not"?

I've always used .search or .match when using regular expressions.
Line 2 seems an interesting alternative.

What's the difference between patt=" " and patt=/ /? Or does
Javascript automatically recognise the use of slashes to denote the
boundaries of a regexp? And is the point of those two lines to test
for the presence of a space character within the string pcode? If so,
it seems a very efficient way to go about it!

Steve
Denis McMahon
2010-10-14 14:57:52 UTC
Post by s***@hotmail.com
Thank you, Rob and Denis, for your suggestions and it has certainly
given me something to work on. Obviously Substring is the function I
need to concentrate on! I think, Denis, your function is more or less
complete as it is.
1 var patt = /^[a-z]{1,2}[0-9][0-9a-z]? ?[0-9][a-z]{2}$/i;
2 var test = patt.test(pcode);
3 if (!test) return ""; // didn't validate
4 patt = / /;
5 test = patt.test(pcode);
I understand the first line is loading patt with the regexp for the
postcode. Does the ? ? mean "return as true whether there is a space
in there or not"?
? always refers to the preceding (when read from the left) character or
expression

/^[a-z]{1,2}[0-9][0-9a-z]? ?[0-9][a-z]{2}$/i

'/' = start of regex pattern
'^' = match start of string
'[a-z]{1,2}' = 1 or 2 alpha characters
'[0-9]' = a digit
'[0-9a-z]?' = optionally a digit character or alpha character
' ?' = optionally a space
'[0-9]' = a digit character
'[a-z]{2}' = 2 alpha characters
'$' = end of string
'/' = end of regex pattern
'i' = ignore case
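
To see the breakdown in action, the pattern can be run against a few strings. The sample values are illustrative only (format examples, not checked as real addresses):

```javascript
var patt = /^[a-z]{1,2}[0-9][0-9a-z]? ?[0-9][a-z]{2}$/i;

// Illustrative inputs: three that fit the format, three that do not.
var samples = ["M1 1AA", "me12tr", "SW1A 4AA", "GIR 0AA", "ME1  2TR", "ME12"];

var results = samples.map(function (s) {
  return patt.test(s);
});

samples.forEach(function (s, i) {
  console.log(JSON.stringify(s) + " -> " + results[i]);
});
```

The first three match. "GIR 0AA" fails because no digit follows the first one or two letters, "ME1  2TR" fails because ' ?' allows at most one space, and "ME12" fails because the final digit and two letters are missing.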
Post by s***@hotmail.com
I've always used .search or .match when using regular expressions.
Line 2 seems an interesting alternative.
regex.test(string) returns true or false
string.search(regex) returns the position, you could test on result
being == -1 for failure
string.match returns an array of matches or null, you could test on
result being null for failure

But I just find that testing for true or false is simpler if that's all
you need to know
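
The three calls side by side, as a small sketch (variable names are illustrative):

```javascript
var patt = / /;
var withSpace = "ME1 2TR";
var without = "ME12TR";

console.log(patt.test(withSpace));   // true
console.log(patt.test(without));     // false

console.log(withSpace.search(patt)); // 3, the index of the space
console.log(without.search(patt));   // -1

console.log(withSpace.match(patt));  // an array whose first element is " "
console.log(without.match(patt));    // null

// A plain string such as " " has no test() method; with a string one
// would use indexOf() instead.
console.log(withSpace.indexOf(" ") !== -1); // true
```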
Post by s***@hotmail.com
What's the difference between patt=" " and patt=/ /? Or does
Javascript automatically recognise the use of slashes to denote the
boundaries of a regexp? And is the point of those two lines to test
for the presence of a space character within the string pcode? If so,
it seems a very efficient way to go about it!
First test validates the format of the postcode, whether it has a space
in it or not.

Note that it validates the format, not the data. The expression will
validate bad data in the correct format. If you want to validate the
whole postcode, you need to lookup against a database of valid
postcodes. There's a dataset you can download (after jumping through
some hoops) on the Ordnance Survey website that has every UK postcode
and its national grid co-ordinates.

Second test determines if the space is present or not, and then if it's
not present, then the code inserts it.

If you only want to determine if a space is present, you only need one
test. But, I felt that it would be better, if you were going to try and
correct the entered postcode, to make sure that the data fitted the
defined format first.

Rgds

Denis McMahon
s***@hotmail.com
2010-10-14 15:26:09 UTC
Thank you for the time you have taken to comprehensively answer my
query. I see that your regexp is more flexible than the one I was
using before, as yours checks for London postcodes like SW1A 4AA (the
extra letter after the first number).

I shall bookmark this thread and refer to it in future - it's the best
blow-by-blow explanation of an expression I have seen so far!

Steve
John G Harris
2010-10-15 15:41:35 UTC
Post by s***@hotmail.com
Thank you for the time you have taken to comprehensively answer my
query. I see that your regexp is more flexible than the one I was
using before, as yours checks for London postcodes like SW1A 4AA (the
extra letter after the first number).
I shall bookmark this thread and refer to it in future - it's the best
blow-by-blow explanation of an expression I have seen so far!
As a matter of interest, did the solution have to use a RegExp ?

John
--
John Harris
Denis McMahon
2010-10-15 22:52:48 UTC
Post by John G Harris
Post by s***@hotmail.com
Thank you for the time you have taken to comprehensively answer my
query. I see that your regexp is more flexible than the one I was
using before, as yours checks for London postcodes like SW1A 4AA (the
extra letter after the first number).
I shall bookmark this thread and refer to it in future - it's the best
blow-by-blow explanation of an expression I have seen so far!
As a matter of interest, did the solution have to use a RegExp ?
Probably not, it was just how I chose to do it.

I'm sure it's possible to solve without regex.

Rgds

Denis McMahon
Denis McMahon
2010-10-19 16:09:47 UTC
Post by Denis McMahon
Post by John G Harris
Post by s***@hotmail.com
Thank you for the time you have taken to comprehensively answer my
query. I see that your regexp is more flexible than the one I was
using before, as yours checks for London postcodes like SW1A 4AA (the
extra letter after the first number).
I shall bookmark this thread and refer to it in future - it's the best
blow-by-blow explanation of an expression I have seen so far!
As a matter of interest, did the solution have to use a RegExp ?
Probably not, it was just how I chose to do it.
I'm sure it's possible to solve without regex.
Hmm, if he just wants to format it, the best approach may be remove any
spaces, insert a single space before the last 3 characters, make sure
all the letters are upper case.

function formatPostcode(pcode)
{
pcode.replace(/ */g,"");
pcode.replace(/([a-z0-9]{2,4})([0-9a-z]{3})$/i;
,"$1 $2");
pcode.toUpperCase();
return pcode;
}

Lots of ways to skin this cat.

Rgds

Denis McMahon
Evertjan.
2010-10-19 18:00:47 UTC
Post by Denis McMahon
pcode.replace(/ */g,"");
Wrong as the result would be thrown away.

Wrong also in the sense that this would needlessly replace each location
of zero spaces [so between 2 nonspace characters] with an empty string.

Try:

pcode = pcode.replace(/ +/g,'');

or even better as perhaps the whitespace included a tab:

pcode = pcode.replace(/\s+/g,'');
Post by Denis McMahon
pcode.replace(/([a-z0-9]{2,4})([0-9a-z]{3})$/i;
,"$1 $2");
Again wrong as the result would be thrown away.

If this is only about standard British postcodes,
try:

pcode = pcode.replace(/(\d[a-z]{2})$/i,' $1');
Post by Denis McMahon
pcode.toUpperCase();
Again wrong as the result would be thrown away.

pcode = pcode.toUpperCase();
Post by Denis McMahon
return pcode;
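
Putting the three corrections together gives something like the following. This is a sketch assembled from the lines above, not a tested posting from the thread:

```javascript
function formatPostcode(pcode) {
  // Remove all whitespace (spaces, tabs) and keep the result.
  pcode = pcode.replace(/\s+/g, "");
  // Insert one space before the final digit-letter-letter group.
  pcode = pcode.replace(/(\d[a-z]{2})$/i, " $1");
  // Strings are immutable, so the upper-cased copy must be returned.
  return pcode.toUpperCase();
}

console.log(formatPostcode(" me1 2tr ")); // ME1 2TR
```

If the input does not end in a digit followed by two letters, the second replace() leaves it unchanged and no space is inserted.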
--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Dr J R Stockton
2010-10-20 19:26:33 UTC
Post by Denis McMahon
Hmm, if he just wants to format it, the best approach may be remove any
spaces, insert a single space before the last 3 characters, make sure
all the letters are upper case.
Yes, and there is no need to check what the last three characters are.
If the result may be considered electronically, rather than merely by
eye, it could be well to also replace those characters which are not,
because of visual ambiguity, used in postcodes with their legitimate
visual equivalents.

Recall that aged typists may habitually type l to mean 1; and
incompetent ones may hit o instead of 0.
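
One narrow way to act on this, as a sketch only: the thread notes that the final three characters are a digit followed by two letters, so a look-alike letter in the digit position can be repaired safely. The function name and the mapping table are assumptions made for illustration, not an official substitution list:

```javascript
// Hypothetical helper: repairs only the position that must hold a digit.
function fixInwardDigit(pcode) {
  var s = pcode.replace(/\s+/g, "").toUpperCase();
  if (s.length < 4) return s;
  var i = s.length - 3; // first character of the final three
  // Illustrative mapping of letters commonly typed in place of digits.
  var lookalikes = { "O": "0", "I": "1", "L": "1" };
  var c = s.charAt(i);
  if (lookalikes.hasOwnProperty(c)) {
    s = s.substring(0, i) + lookalikes[c] + s.substring(i + 1);
  }
  return s;
}

console.log(fixInwardDigit("me1 otr")); // ME10TR
```

The result has no space; one can be inserted before the last three characters afterwards.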
--
(c) John Stockton, nr London UK. ?@merlyn.demon.co.uk DOS 3.3 6.20 ; WinXP.
Web <http://www.merlyn.demon.co.uk/> - FAQqish topics, acronyms & links.
PAS EXE TXT ZIP via <http://www.merlyn.demon.co.uk/programs/00index.htm>
My DOS <http://www.merlyn.demon.co.uk/batfiles.htm> - also batprogs.htm.
Dr J R Stockton
2010-10-15 17:56:55 UTC
Post by Denis McMahon
Note that it validates the format, not the data. The expression will
validate bad data in the correct format. If you want to validate the
whole postcode, you need to lookup against a database of valid
postcodes. There's a dataset you can download (after jumping through
some hoops) on the Ordnance Survey website that has every UK postcode
and its national grid co-ordinates.
One can validate the pattern, more or less; and one can check that only
the allowed letters are used.

But, since a false INVALID is a severe error, one should not validate
against a copy of the OS database unless that copy is going to be
refreshed sufficiently frequently.


It is clear that whitespace has no meaning in UK postcodes. If the aim
is to generate the same meaning in standardised layout, all whitespace
should therefore first be removed, and any non alphanumeric is then a
definite error. If a single space is needed (before the last three
characters) a RegExp replace as posted earlier must be the simplest way,
for those who know all (or at least sufficient) of ECMA 262.

It may, however, be useful to pad the first part with spaces to a length
of maybe 5 characters, and then add the last 3, generating a fixed-
length string which tabulates neatly.

Tmp = In.replace(/\s+/g, "")
Out = Tmp.replace(/(...)$/, "  ".substring(0, 7-Tmp.length)+" $1")

which inserts at least one space, and more if wanted. The second
argument of the replace can instead use a call of a pre-existing space-
generating routine.
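
The same fixed-width layout can be sketched with String.prototype.padEnd(), swapped in here for the substring trick and assuming an ES2017 or later engine. The function name is hypothetical:

```javascript
// Hypothetical helper producing an 8-character, column-aligned postcode.
function tabulatePostcode(input) {
  var tmp = input.replace(/\s+/g, "");
  if (tmp.length < 4) return tmp; // nothing sensible to split
  var firstPart = tmp.substring(0, tmp.length - 3);
  var lastThree = tmp.substring(tmp.length - 3);
  // Pad the first part with spaces to five characters, then add the last three.
  return firstPart.padEnd(5, " ") + lastThree;
}

console.log("[" + tabulatePostcode("M11AA") + "]");   // [M1   1AA]
console.log("[" + tabulatePostcode("SW1A1AA") + "]"); // [SW1A 1AA]
```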
--
(c) John Stockton, nr London, UK. ?@merlyn.demon.co.uk Turnpike v6.05.
Website <http://www.merlyn.demon.co.uk/> - w. FAQish topics, links, acronyms
PAS EXE etc. : <http://www.merlyn.demon.co.uk/programs/> - see in 00index.htm
Dates - miscdate.htm estrdate.htm js-dates.htm pas-time.htm critdate.htm etc.
Dr J R Stockton
2010-10-14 18:00:55 UTC
In comp.lang.javascript message <345645a2-71e0-46e3-9775-***@c1
3g2000vbr.googlegroups.com>, Wed, 13 Oct 2010 00:07:42,
Post by s***@hotmail.com
A postcode in the U.K. normally has 1 or 2 characters, followed by 1
or 2 numbers, a space, a number and two characters. Validating a form
value to see it matches this criteria is easy (using regular
expressions) and documented all over the web.
However, I have not been able to trace a method of inserting a space
in the correct place if the user has not done so. For example, if the
user types "me12tr", changing it to upper case is easy, so we have
"ME12TR", but I would like in this instance to automatically insert a
space between the 1 and the 2, without having to prompt them to do so.
The start of U.K. postcodes vary (M1, CT2, SW1A are all valid), but
there seems always to be one digit and two characters after the space,
at the end of the postcode. So is there a way in Javascript I can
insert a space between the third and fourth characters *counting from
the right* of the entered string, to correct a postcode entered
without the required space?
Thank you for any advice you can give.
(1) "criteria" is the plural of "criterion".

(2) Yes, but there are other possibilities (on a liberal but sometimes
necessary interpretation of "UK").

(3) The space goes before the last occurrence of "digit letter".

(4) Read <http://en.wikipedia.org/wiki/Uk_postcodes>.

(5) <http://www.merlyn.demon.co.uk/js-order.htm#UKPC> may be of
interest. Easy sorting needs different spacing rules.

(6) Good = Bad.replace(/ /g, "").replace(/(...)$/, " $1") ; fails for
mail to Anguilla and BFPO, I gather.
--
(c) John Stockton, nr London UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
Web <http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, and links.
Proper <= 4-line sig. separator as above, a line exactly "-- " (RFCs 5536/7)
Do not Mail News to me. Before a reply, quote with ">" or "> " (RFCs 5536/7)
s***@hotmail.com
2010-10-15 08:05:30 UTC
Thank you for that, John. From the Wikipedia article, the only real
problem I can see is validating the Girobank's postcode (GIR 0AA) as
it has no digit in the 2nd or 3rd position. For my own purposes, the
postcodes round where I live are always LLN(N) NLL, so the previous
response is more than adequate.

Yes, you're right it should be criterion. Good job I wasn't talking
about an item of data as well. Or worse, the criteria for an item of
data...!

Steve
RobG
2010-10-15 09:28:50 UTC
Thank you for that, John.  From the Wikipedia article, the only real
problem I can see is validating the Girobank's postcode (GIR 0AA) as
it has no digit in the 2nd or 3rd position.  For my own purposes, the
postcodes round where I live are always LLN(N) NLL, so the previous
response is more than adequate.
Yes, you're right it should be criterion.
You could also have written "these criteria".
 Good job I wasn't talking
about an item of data as well.
I think in modern usage at least, "data" can be used for either
singular or plural, though some might say that is not strictly
correct. The singular "datum" is usually used in a quite different
context, with a meaning similar to "benchmark".

BTW, you originally said that you didn't need to validate the
postcode, only insert a space. If all you do is validate the format
(noting that you tolerate one type of error in the format but not
others), you still don't know if it's a valid post code, so what have
you gained?

Presumably you will check at some point against a list of valid codes,
so whether the format is right or wrong, the same check must be made.


--
Rob
s***@hotmail.com
2010-10-15 10:12:43 UTC
Post by RobG
Presumably you will check at some point against a list of valid codes,
so whether the format is right or wrong, the same check must be made.
Well, I already use a regular expression to check correct postcode
pattern (at least for local codes). The postcodes will feature in
residents' surveys for my employer, the local council. So most of the
time, the format will be predictable - LLN NLL.

We don't check the validity of postcodes in terms of whether that
particular combination exists. As Denis said above, this would entail
linking in to some external database to look up the postcode for
validity. I really just want to stop people typing in just the first
3 characters of their postcode, when we really want the whole thing.
There's nothing to stop them just plain lying, I suppose.

Someone already pointed out to me that my thinking is a little
backwards - I should be removing the space from postcodes typed in,
not looking for ways of inserting one, by virtue of the redundancy of
the space character in a postcode anyway. But the space character
does help with further processing of the postcode areas - it is easier
to split off the last three characters in Excel if there is a space
present, rather than resorting to formulae or macros.

Latinate plurals are a sticky point: technically, the plural of forum
is fora, not forums; referenda not referendums, for example. But in
everyday use people get confused by terms like "residents' fora" so we
just use forums instead.

Steve
Dr J R Stockton
2010-10-16 16:54:34 UTC
In comp.lang.javascript message <fe0c0828-b818-48f7-82cb-***@28
g2000yqm.googlegroups.com>, Fri, 15 Oct 2010 01:05:30,
Post by s***@hotmail.com
Thank you for that, John. From the Wikipedia article, the only real
problem I can see is validating the Girobank's postcode (GIR 0AA) as
it has no digit in the 2nd or 3rd position. For my own purposes, the
postcodes round where I live are always LLN(N) NLL, so the previous
response is more than adequate.
Once you have a sufficiently good way of validating the usual patterns
which certainly accepts NO invalid codes (no need to care about the
characters which the Royal Mail consider indistinguishable), then those
which are rejected at that stage can be processed by some slower means -
a lookup table of known exceptions, matching against other RegExps,
asking a human, ....
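
A rough sketch of that two-stage idea. The names are hypothetical, the exception table holds only the GIR 0AA case mentioned in this thread, and the fast pattern is the loose one posted earlier, so it falls short of the "accepts NO invalid codes" ideal:

```javascript
// Illustrative exception table; GIR 0AA is the only entry taken from the thread.
var KNOWN_EXCEPTIONS = { "GIR0AA": "GIR 0AA" };

// Usual pattern, applied after whitespace removal and upper-casing.
var USUAL = /^[A-Z]{1,2}[0-9][0-9A-Z]?[0-9][A-Z]{2}$/;

function checkPostcode(input) {
  var compact = input.replace(/\s+/g, "").toUpperCase();
  // Stage 1: fast check against the usual pattern.
  if (USUAL.test(compact)) {
    return compact.replace(/(...)$/, " $1");
  }
  // Stage 2: slower lookup of known exceptions.
  if (KNOWN_EXCEPTIONS.hasOwnProperty(compact)) {
    return KNOWN_EXCEPTIONS[compact];
  }
  return null; // left for some other check, such as asking a human
}

console.log(checkPostcode("gir 0aa")); // GIR 0AA
```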

If your input is on a form, you'll know where the postcode ought to be.
If it is free-format, you may need to consider the possibility of it not
being last - it may be followed by UK GB Wales etc.

Perhaps you could add something which partially validates UK telephone
numbers by rejecting 0207 ddd dddd and 0208 ddd dddd !


Your council probably should have access to BS 7666.

<http://www.cabinetoffice.gov.uk/govtalk/schemasstandards/e-gif/datastandards/address/postcode.aspx>.
axlq
2010-10-18 17:38:41 UTC
Post by s***@hotmail.com
A postcode in the U.K. normally has 1 or 2 characters, followed by 1
or 2 numbers, a space, a number and two characters. Validating a form
value to see it matches this criteria is easy (using regular
expressions) and documented all over the web.
However, I have not been able to trace a method of inserting a space
in the correct place if the user has not done so. For example, if the
user types "me12tr", changing it to upper case is easy, so we have
"ME12TR", but I would like in this instance to automatically insert a
space between the 1 and the 2, without having to prompt them to do so.
The start of U.K. postcodes vary (M1, CT2, SW1A are all valid), but
there seems always to be one digit and two characters after the space,
at the end of the postcode. So is there a way in Javascript I can
insert a space between the third and fourth characters *counting from
the right* of the entered string, to correct a postcode entered
without the required space?
I have had a similar problem but in the end I had to question: why?

I ask this from the perspective of someone who is developing a
web application that needs to validate and store addresses from
locations worldwide in a database.

After reading this thread, I am scratching my head wondering why
bother including the space at all, why not simply remove all
spaces? From my perspective, that would make the postal code
consistent with much of the rest of the world, and I wouldn't have
to worry about whether local customs put a space in it or not.

If I need to geolocate an address, the postal code works quite well
without the space character. Google Maps, for example, recognizes
ME12TR as a postal code in Rochester, UK.

I'm curious about the rationale for including the space character.

-A
s***@hotmail.com
2010-10-19 06:28:00 UTC
Post by axlq
I'm curious about the rationale for including the space character.
Well, in my case, it is because the data collected will eventually be
extracted to Excel and the people processing it after that find it
easier to split the postcode into areas (the first part of the
postcode) to analyse in Excel if a space character is present.

Perhaps a formula could be used to split the final three characters
from the string, but I guess people are just used to seeing a postcode
with a space in the middle!

You are correct that the space performs no function other than aid
readability for humans. In this respect, I have seen a fad for
splitting up phone numbers into groups of three these days, e.g. 04561
454 897, to aid readability and reduce human errors. Or even the
French way: they do their phone numbers in groups of two: 04 76 96 12
32, and pronounce them that way too, rather than individual digits.

Steve
Dr J R Stockton
2010-10-19 19:19:23 UTC
Post by axlq
After reading this thread, I am scratching my head wondering why
bother including the space at all, why not simply remove all
spaces? From my perspective, that would make the postal code
consistent with much of the rest of the world, and I wouldn't have
to worry about whether local customs put a space in it or not.
Among others, Guernsey, Canada, UK and Netherlands postal codes include
a space. Japan, and I think others, use some form of dash for the same
purpose.
Post by axlq
If I need to geolocate an address, the postal code works quite well
without the space character. Google Maps, for example, recognizes
ME12TR as a postal code in Rochester, UK.
Indeed. But the main purpose of the code is to assist postal delivery;
and, where they have specified a space they will presumably like to have
a space.

The space adds no meaning, so it can safely be removed for processing
provided that, where required, it is correctly inserted when presented
or printed for postal purposes.