[ACCEPTED]-What are design patterns to support custom fields in an application?-data-modeling

Accepted answer
Score: 15

I do agree with posters below that Options 10 3, 4, or 5 are most likely to be appropriate. However, each 9 of your suggested implementations has its 8 benefits and costs. I'd suggest choosing 7 one by matching it to your specific requirements. For 6 example:

  1. Option 1 pros: Fast to implement. Allows DB actions on custom fields (searching, sorting.)
    Option 1 cons: Custom fields are generic, so no strongly-typed fields. Database table is inefficient, size-wise with many extraneous fields that will never be used. Number of custom fields allowed needs to be anticipated.
  2. Option 2 pros: Fast to implement. Flexible, allowing arbitrary number and type of custom fields.
    Option 2 cons: No DB actions possible on custom fields. This is best if all you need to do is display the custom fields, later, or do minor manipulations of the data only on a per-Customer basis.
  3. Option 3 pros: Both flexible and efficient. DB actions can be performed, but the data is normalized somewhat to reduce wasted space. I agree with unknown (google)'s suggestion that you add an additional column that can be used to specify type or source information. Option 3 cons: Slight increase in development time and complexity of your queries, but there really aren't too many cons, here.
  4. Option 4 is the same as Option 3, except that your typed data can be operated on at the DB level. The addition of type information to the link table in Option 3 allows you to do more operations at our application level, but the DB won't be able to do comparisons or sort, for example. The choice between 3 and 4 depends on this requirement.
  5. Option 5 is the same as 3 or 4, but with even more flexibility to apply the solution to many different tables. The cost in this case will be that the size of this table will grow much larger. If you are doing many expensive join operations to get to your custom fields, this solution may not scale well.

P.S. As noted below, the term "design 5 pattern" usually refers to object-oriented 4 programming. You're looking for a solution 3 to a database design problem, which means 2 that most advice regarding design patterns 1 won't be applicable.

Score: 11

As far as the application code is concerned 33 I'm unsure. I do know that custom fields 32 benefit greatly from a EAV model in the database.

Per 31 the comments below, the most significant 30 mistake you can make with this model is 29 putting foreign keys into it. Never ever 28 put something like FriendID or TypeID into 27 this model. Use this model in conjunction 26 with the typical relational model and keep 25 foreign key fields in table columns as they 24 should.

A second significant mistake is 23 placing data in this model that needs to 22 be reported with every element. For example 21 putting something like Username in this 20 model would mean that anytime you want to 19 access a user and need to know their username 18 you've committed yourself to a join at best 17 or 2n queries where n is the number of users 16 you're looking at. When you consider that 15 you are usually going to need the Username 14 property for every User element it becomes 13 obvious this too should remain in the table 12 columns.

However, if you're just using this 11 model with custom user fields you'll be 10 fine. I can't imagine many situations where 9 a user would be entering in relational data 8 and the EAV model is not too significantly 7 detrimental to searches.

Lastly, don't try 6 to join data from this and get a nice pretty 5 recordset. Grab the original record and 4 then grab the set of records for the entity. If 3 you find yourself tempted to join the tables 2 you've probably made the second mistake 1 as mentioned above.

Score: 5

If you're developing with an object oriented 36 language, we're talking about adaptive object models here. There 35 are quite a few articles written about how 34 you can implement them in oo-languages, but 33 not so much information about how to design 32 the data store side.

In the company where 31 I work, we have solved the problem by using 30 a relational database to store AOM data. We 29 have central entity table for presenting 28 all the different "entities" in 27 the domain, like people, network devices, companies, etc... We 26 store the actual "form fields" to 25 data tables that are typed, so we have one 24 table for strings, one for dates and so 23 on. All the data tables have a foreign key 22 pointing to the entity table. We also need 21 tables to present the type-side, i.e. what 20 kind of attributes (form fields) can certain 19 entity have and this information is used 18 to interpret the data in data tables.

Pros 17 of our solution are that anything can be 16 modeled without code changes, including 15 references between entities, multivalues 14 and so on. It's also possible to add business 13 rules and validations to fields and they 12 can be reused in all form. Cons are that 11 the programming model is not very easy to 10 understand and query performance will be 9 worse than with a more typical DB design. Some 8 other solution than relational database 7 could have been better and easier for AOM.

Building 6 a good AOM with a working data store for 5 it is a lot of work and I wouldn't recommend 4 it if you don't have highly skilled developers. Maybe 3 one day there will be an OS solution for 2 these kinds of requirements.

Custom fields 1 have been discussed before in SO:

Score: 3

Something like Option 3 is the way to go 17 and i have used this method previously. Create 16 a single table to define additional properties 15 and their corresponding values. This would 14 be a 1-N relationship between your Customer 13 and CustomerCustomField table (respectively). Your 12 second question regarding defining relationships 11 with custom properties would be something 10 to think about. The first thing that comes 9 to mind is adding a DataSource field, which 8 would contain the table to which the property 7 value is bound to. So essentially your CustomerCustomField 6 would look like:

  1. CustomerId
  2. Property
  3. Value
  4. ValueDataSource (nullable)

This should allow you to 5 either bind to a specific data structure 4 or simply allow you to specify unbound values. You 3 can further normalize this model, but something 2 like this could work and should be easy 1 enough to handle in code.

Score: 3

Option 4 or 5 would be my choice. If your 14 data is important, I wouldn't go tossing 13 away your type information with Option 3. (You 12 might try to implement full type-checking 11 yourself, but it's a pretty big job, and 10 the database engine already does it for 9 you.)

Some thoughts:

  • Make sure your CustomFields has a DataType column.
    • Use a UDF-based check constraint on CustomFieldValues to ensure that the column specified by CustomFields.DataType is non-null.
    • You'll also want a standard check constraint to make sure you have exactly one non-null value.
  • Regarding foreign keys, I would model these as a separate DataType.
    • Each potential cross-table reference would require its own column. This is good, because it maintains referential integrity.
    • You would have to support these relationships in application code anyway, so the fact that they are hard-coded in the database does not actually limit functionality.
    • This will also jive well with your ORM, if you're using one.
  • For Option 5, use intermediary tables to model the relationships.
    • You would still have a CustomerCustomFieldValue, but instead with only CustomerID and CustomFieldValueID columns.
  • Think long and hard about your constraints every step of the way. This is tricky stuff, and one misstep can cause utter havok down the line.

I am using this in an 8 application currently in development. There 7 haven't been any problems yet, but EAV designs 6 still scare the daylights out of me. Just 5 be careful.

As an aside, XML may also be 4 a good choice. I don't know as much about 3 it from direct experience, but it was one 2 of the options I considered when starting 1 the data design, and it looked pretty promising.

Score: 0

if those 'extra' fields are incidental and 6 don't care to do searches on them, I usually 5 go for option 2 (but like JSON better than 4 XML). If there's going to be searches on 3 custom fields, option 3 isn't hard to do, and 2 usually the SQL optimizer can get reasonable 1 performance out of it.

Score: 0

I am currently working on a project with 14 this same problem, and I have chosen to 13 use option 3, but I added a FieldType field 12 and a ListSource field in case the FieldType="list". The 11 ListSource field could be a query, an sql 10 view, a function name, or something that 9 results in a list of options for the list. The 8 biggest problem with trying to store fields 7 like this in my situation is that this field 6 list can change, and the users are allowed 5 to edit the data later. So what to do if 4 the field list has changed and they go to 3 edit. My solution to that scenario was to 2 allow editing only if the list hasn't changed 1 and to display read-only data if it has.

Score: 0

The table does not matter, imagine a table 85 for stations. The fields of this can go to a form. Obviously 84 this form is static, since the fields are 83 fixed in the model.

What a CCK does (it's 82 an old Drupal 6 module) is create other 81 tables that relate extra fields to one or more 80 tables. To do this, 3 tables are created 79 called:

  • table
  • attributes
  • values

You can put a prefix on them if you 78 do not want to touch your current model.

Note: The 77 antipattern is called the Entity–Attribute–Value model (EAV). It is 76 not recommended to use in large-scale databases 75 due to the delay in obtaining the associated 74 records. Or, if that is the case, at least 73 use on tables that do not require frequent 72 queries and that these tables are not important 71 within your model.

In table goes a list of table 70 names to which you will allow extra fields 69 and an id. Here only goes the id and name (which 68 is the name of the table you want to expand). If 67 you want to give a table the possibility 66 of defining new fields, you just have to 65 add a record here.

In attributes fields called id, name, type and 64 tables_id are created. The name is the field name that 63 would be associated in the new version of 62 the table. And in type is associated the type 61 of value that it uses (integer, string, date, time, etc). The 60 data types depend on the DB engine, but 59 you can start with only some that the engine 58 allows. Only there will be no relations 57 or indexes for these fields since they are 56 optional. table_id is associated with the id of the 55 record used in table (table.id).

In values fields are written 54 with all the possible types that you have 53 defined for attributes. For example, if 52 you only allow integers and strings, you 51 only have to create two fields, one for 50 integer and one for string. It is important 49 to note that the default values of these 48 types will always be null. Together with two 47 other fields, one, let's call it attributes_id, associated 46 with the attributes record that will indicate the 45 name and type of field that you must use 44 to store the value (that is, the field of 43 this table according to what the attributes 42 table defines). And another one called objective_id that 41 points to the record inside your final table 40 (let's say a record in stations).

Stations

id name lat long
1 Alto Hospicio -20.290467 -70.100192
2 Gobernación -22.093082 -70.201210
3 Hospital el Cobre -22.450496 -68.908442

Table

id name
10 stations
20 fooTable

Attributes

id name type tables_id
100 key integer 10
200 email string 20
300 start date 10

Values

objective_id attributes_id integer date string
1 100 117 NULL NULL
1 300 NULL 2022-01-01 NULL
3 100 217 NULL NULL

Now, in your view 39 with the form, you can check whether or 38 not to add a new form field according to 37 whether or not there is a record associated 36 with that table in the attributes table (you know 35 in this view that the final table is stations by 34 which is just doing a SELECT of all the 33 records associated with that table name). Depending 32 on its type, you can add validations. When 31 you store or save this field you create 30 or update it in the values table and write the 29 value provided by your form in the appropriate 28 field, the id of your record in the final 27 table (stations) is copied to objective_id. To be dynamic 26 you must incorporate this generation in 25 all the forms associated with a table. Or 24 at least in those that you previously define 23 in tables.

The same analysis to present the values 22 in a report. You iterate in the attributes table in 21 case there is an attribute to your report 20 (stations) and for each record obtained 19 you consult the values table filtering by 18 attributes_id and objective_id (which is the id of the current station 17 to report).

Stations report table (view)

name lat long key start
Alto Hospicio -20.290467 -70.100192 117 2022-01-01
Gobernación -22.093082 -70.201210
Hospital el Cobre -22.450496 -68.908442 217

You can delete the values directly 16 in the values table, but you cannot delete records 15 from the tables table or attributes. This to avoid future 14 inconsistencies. To simplify the latter, you 13 can opt for logical deletions by adding 12 date attributes like deleted_at both of this tables.

To 11 prevent inconsistencies you must create 10 a unique constraint that prevents associated 9 duplicate records.

  • table: name
  • attributes: (name, tables_id)
  • values: (objective_id, attributes_id)

To add custom validations 8 you could add a field in the attributes table with 7 a regular expression that administrators 6 can configure. They should contain default 5 values depending on the data type. This 4 is optional.

Attributes

id name type tables_id regex
100 key integer 10 ^\d+$
200 email string 20 [^@ \t\r\n]+@[^@ \t\r\n]+\.[^@ \t\r\n]+
300 start date 10 ^\d{4}-([0]\d|1[0-2])-([0-2]\d|3[01])$

Obviously this applies to many 3 uses and facilitates the growth of models 2 by the user. Against it is the lack of relationships 1 that will simplify queries.

Check this considerations for EAV.

More Related questions