News

“Reasonable” Liquidated Damages

In Lifeline, Inc. v. Poyner, the Appellate Court of Maryland affirmed the trial court’s decision to award $75,000 in liquidated damages to Christal Poyner, the seller, after Lifeline, Inc., the buyer, failed to complete the purchase of a $2.2 million commercial property.

Key terms of the agreement included:

  • Lifeline paid a $75,000 deposit to an escrow agent.
  • The contract, drafted by Lifeline, included a liquidated damages clause:
    • If Lifeline defaulted, Poyner’s sole remedy was to retain the deposit.
  • Lifeline had a “Feasibility Period” to secure financing and walk away with the deposit refunded.
  • Lifeline missed the deadline without terminating, then defaulted.

To be valid in Maryland, a liquidated damages clause must: (a) specify a clear sum; (b) represent a reasonable estimate of anticipated damages; and (c) be binding and not subject to recalculation later.

    The court found the clause valid in this case because:

    • The $75,000 amount was specific and agreed upon.
    • It was reasonable (~3.4% of the $2.2M price), especially since Poyner testified to:
      • Lost rental opportunities,
      • Market withdrawal,
      • Ongoing property costs,
      • Lost investment income.
    • The parties expressly agreed the deposit would be the sole remedy, and actual damages would be difficult to calculate.

    Lifeline argued the clause was an unenforceable penalty, citing Willard Packaging Co., but the court found Willard distinguishable: that case lacked evidence of harm or a rational estimate of damages, whereas the present case included ample evidence of real, even if not precisely quantifiable, harm.

    The court upheld the clause as a valid, enforceable liquidated damages provision, not a penalty. Lifeline breached the contract by failing to close and not terminating within the allowed timeframe, so Poyner was entitled to keep the deposit.

    Satisfied Judgment Cuts off Further Discovery by Creditors

    In Batiste v. Davis (an unreported 2025 Maryland Appellate Court decision), the appellants (Batiste and Roy, the “Creditors”) claimed Gervonta Davis committed a material breach of contract by failing to pay the full $35,000 due under a 2014 Confidential Settlement Agreement, which had released him from a prior 2013 boxing management contract. The Creditors received only $2,000 by the 2016 payment deadline. They argued this breach triggered a clause allowing reinstatement of the original 2013 contract, entitling them to 15% of Davis’s boxing earnings through 2020.

    They sued and, in 2019, obtained a default judgment of $390,000, based on known fight earnings during this so-called “second managerial period.” Davis later settled in 2022, agreeing to pay $468,000, covering the judgment and post-judgment costs. The Creditors accepted the money but did not record the judgment as satisfied, and continued seeking discovery, allegedly to enforce additional earnings they believed were still owed under the reinstated contract.

    The appellate court rejected the Creditors’ argument, holding:

    • The 2014 settlement was the controlling agreement.
    • Any breach (e.g., failure to pay $35,000) was remedied by the 2022 settlement, which fully satisfied the 2019 judgment.
    • Under Maryland Rule 2-626, once the agreed amount was paid, the judgment was satisfied, and no further discovery could be pursued without a new judgment.
    • The Creditors could not seek post-judgment discovery for a “hypothetical” claim.

    Although the initial breach of the 2014 agreement was alleged to be material (i.e., nonpayment of $33,000), the court did not disturb that finding. However, it emphasized that even if that breach had reinstated the old contract, the later 2022 Agreement settled the entire dispute, extinguishing any right to enforce prior agreements unless a new judgment was pursued.

    Fraud in the Inducement as a Contract Defense

    In the case of Swinton Home Care, LLC v. Lillian Tayman, Swinton, a home health care provider, sued Tayman and her deceased sister’s estate for breach of contract, quantum meruit, and unjust enrichment.

    The case arose from a situation where Swinton’s representative convinced Lillian Tayman to sign a contract on behalf of her sister, Mary Tayman, even though she did not have the authority to do so. Mary Tayman paid for the services for a while, then defaulted, and subsequently died.

    The Circuit Court for Montgomery County found that Swinton fraudulently induced Lillian Tayman to sign the contract, thereby voiding the contract and eliminating any guaranty obligation on her part. In Maryland, fraudulent inducement refers to “a situation where a person is induced by some fraudulent representation or pretense to execute the very instrument which is intended to be executed[.]” Meyers v. Murphy, 181 Md. 98, 100 (1942). Fraud must be demonstrated by clear and convincing evidence, a higher evidentiary bar than that applied in a typical civil action.

    The court ruled in favor of the Taymans. On appeal, Swinton challenged the sufficiency of the evidence supporting this decision, but the judgment was affirmed. This case is relevant for clients as it highlights the legal consequences of inducing someone to sign a contract without proper authority, and how such actions can lead to a contract being voided.

    Interpreting Settlement Agreement Intention

    Case: Adventist Healthcare, Inc. v. Steven S. Behram, No. 16, September Term, 2023.

    Issue: The issue revolves around the interpretation of a confidential settlement agreement between a hospital (Adventist Healthcare, Inc.) and a physician (Dr. Steven S. Behram). The hospital was required to submit a report to a regulatory authority using specific language agreed upon in the settlement. However, the physician alleged that the hospital selected codes that contradicted the agreed language in the report, thus breaching the agreement.

    Decision: The Supreme Court of Maryland disagreed with the lower court’s decision to award summary judgment to the hospital. It concluded that a reasonable person in the position of the parties would have understood that the hospital’s obligation to use specific language in its report precluded it from also including contradictory language. Whether the added language constituted a breach was deemed a jury question. The court also agreed with the lower court that the physician’s claim that the hospital failed to provide a timely hearing was released in the settlement agreement.

    Relevance: This case is relevant for clients researching how courts interpret contracts, the importance of clear and precise language in settlement agreements, and the potential consequences of deviating from agreed language. The case also underscores the importance of understanding the implications of release clauses, as they can potentially relinquish one’s right to future claims related to the agreement.

    Complexities of Competing Litigation Filings

    Issue: The case HBC US Propco Holdings, LLC v. Federal Realty Investment Trust concerns whether a Maryland court erred in dismissing a lawsuit on the grounds that it should be heard in another forum, namely Pennsylvania.

    Decision: The Appellate Court of Maryland upheld the lower court’s decision to dismiss the lawsuit, citing that it did not abuse its discretion. The court determined that the dispute should proceed in Pennsylvania, as the litigation was about a lease of a retail property located there, governed by Pennsylvania law, and most witnesses were probably residents of Pennsylvania.

    Relevance: The ruling is relevant as it provides guidance on the application of the first-to-file rule and its exceptions, including anticipatory filing and forum shopping. The court found that the appellant had knowledge of the appellee’s intention to file suit and filed a preemptive declaratory judgment action in Maryland, which was deemed an improper anticipatory filing. The decision highlights the importance of considering the appropriateness of the forum in relation to the subject matter of the dispute, the convenience of the parties and witnesses, and the potential for conflicting rulings in different jurisdictions.

    COVID and Enforceability of Commercial Leases

    In the case of SVAP II Pasadena Crossroads LLC v. Fitness International LLC, the court ruled on issues regarding unpaid rent and the impact of the COVID-19 pandemic on commercial lease agreements. Under Maryland law, it was determined that unpaid accrued rent belongs to the landlord at the time of accrual, unless otherwise stated in the agreement.

    The court also considered whether the COVID-19 pandemic and subsequent business shutdowns made performance under a commercial lease legally impossible. It was found that this is dependent on the terms of the lease and the specifics of each case. In this instance, the court ruled that the tenant’s failure to pay rent was not excused by the pandemic as the lease contained a force majeure clause, which allocated the risk of business disruption to the tenant.

    The landlord was also found not to have breached the lease provision that gave the tenant the right to use the premises for the operation of a health club and fitness facility, as the temporary closure of the tenant’s business was due to government orders, not the landlord’s actions. Furthermore, the COVID-19 pandemic was not considered a “casualty” allowing for abatement of rent.

    Numerous cases involving commercial tenants’ non-payment of rent during government-ordered shutdowns have been filed as a result of the pandemic, with mixed results for tenants.

    Using R, MySQL, and Python for Educational Research

    A Rationale

    Perhaps you are a college faculty member and want to know if something you are doing in the classroom is improving student outcomes in your class. Perhaps you are a member of leadership and want some analysis of departmental or college-wide initiatives to evaluate their impact on student learning. Perhaps you have a systems background and are just interested. There are a variety of use cases for building databases and using statistical packages to analyze them.

    This paper is intended to provide some ideas and examples of how to accomplish your analytical goals. The specific tools you may ultimately select may be different, but at the least, this paper can help set a framework for how to approach the problem of collecting and analyzing data.

    How To Get Started

    The first item on your to-do list should be a project plan that identifies the goal of your project. For example: “I want to use a linear regression to evaluate the impact of a teaching technique on student learning” or “I want to build a propensity score matching statistical model to analyze the impact of a change in textbooks on student learning.”

    Next, take some time to consider where you might collect data for your analysis. Does your institution provide access to data via a comma separated file (csv)? Can your colleagues contribute data via Excel? Do you have access to a data warehouse on a SQL Server or Oracle database system? Will you need to gather some data manually?

    Next, consider what data you need to develop your analysis. For educational research, you may need access to a variety of categorical data to be able to control for natural variation in student outcomes based on student-specific or instructor-specific variations. Alyahyan and Dustegor (2020) identified numerous factors that may correlate with student success, including (a) past student performance such as high school grade point average and/or student grade point average in prior college courses; (b) student demographics such as gender, race, and socioeconomic status; (c) the type of class, semester duration, and program of study; (d) psychological factors of the student such as student interest, stress, anxiety, and motivation; and (e) e-learning data points such as student logins to the LMS and other student LMS activity. You also should give due regard to the sample size you may need to be able to generalize your analysis.

    Next, consider what tools you may need for data collection and analysis. This paper discusses several tools that you may wish to use: the statistical package R; a database application, MySQL; and Python, a scripting language that permits the import and pre-processing of raw data into your research database. You may also need a way to manually collect data, perhaps by implementing a website that connects to your database through a server-side scripting language like PHP, or by designing a basic data collection instrument in Excel. Other tools for collecting surveys, such as SurveyMonkey or Microsoft Forms, may also be helpful. Many of the above tools are open source and freely available to researchers, but some may not fit your particular computing environment, or they may not work in exactly the same way as discussed in this paper.

    Finally, the reader should also take into account whether your planned research may require Institutional Review Board (IRB) approval. More information is available in 34 CFR Part 97; some forms of educational research (such as research on the “effectiveness of or the comparison among instructional techniques” under § 97.104(d)(1)) are exempt from IRB review, while others (such as surveys of students under § 97.104(d)(2)) may require some consideration for anonymity or limited IRB review.

    The Prerequisites

    You may or may not have all of the tools described in this paper already available to you. If not, one method to get them is to use Homebrew, a package manager for MacOS and Linux, available here: https://brew.sh/

    Once Homebrew is installed, you can use this tool at the MacOS Terminal shell prompt to install other packages through a simple command line script:

    $ brew install python

    $ brew install mysql

    $ brew install r

    Homebrew will then download and install your tools to a specific directory (Intel-based Macs install under /usr/local; Apple Silicon Macs use /opt/homebrew) and add them to your Terminal PATH so that you can execute commands by just entering:

    $ R

    $ python [/path/to/your/python/script/goes/here]

    Alternately, you can usually download and use a package installer script that will install tools visually and configure certain default options for you. For example, MySQL has a DMG file you can use to install MySQL: https://dev.mysql.com/doc/refman/8.0/en/macos-installation.html. A bit of searching with your favorite web search tool will help you from here.

    You also may or may not be comfortable using the Terminal to interact with your operating system, unless you are a fossil like me and worked with MS DOS 3.1 back when personal computing was still a new idea. If you prefer a graphical user interface, you may need some additional tools that will help you to access and/or analyze your data. For example, HeidiSQL gives you a graphical view of your MySQL databases. R can be installed with a partly graphical user interface (though you will still need to create code for the actual data analysis).

    In addition, R and Python may both need you to do some further installation of modules each uses for various tasks discussed in this paper. Python modules can be installed using pip:

    $ pip install pandas

    $ pip install pymysql

    (The urllib and datetime modules used later in this paper are part of Python’s standard library, so they do not need to be installed with pip.)

    Similarly, R has a command line option to install new packages that you may need within R:

    install.packages("ggplot2")

    install.packages("RMySQL")

    install.packages("lme4")

    install.packages("MatchIt")

    install.packages("marginaleffects")

    install.packages("tidyr")

    (The parallel package ships with base R and does not need a separate install. The cobalt and rbounds packages loaded later in this paper can be installed the same way.)

    Onwards to data analysis!

    The Big Picture

    Before we dive into specifics, let’s talk about the basic process here for using these tools for your research goals.

    Once you have defined the data that you need, you will need to gather that data into one place to conduct your analysis. You have some options here: getting your data from institutional sources, exporting it from a learning management system, or collecting it manually in a simple Excel collection tool. This paper discusses importing from comma-separated values files, but you may encounter other file formats. MySQL itself has a way to import .csv and .xml files natively, though that tool may not be sufficiently flexible for processing your raw data. Python is a more flexible alternative for cleaning up and importing data into a database, though a bit more patience may be needed to develop and test your code.

    Once you have your data, you may be able to begin your statistical analysis in R. In some cases, you may need to define views that include formulas or conditional fields before you can move on to R. This will depend on your specific research questions and the condition of the raw data you processed, and how creative you are with your import process.

    From within R, you can conduct a variety of statistical and graphical analyses of your data, using a variety of modules available to R users, such as lm, lmer, matchit, and ggplot.

    Data Collection

    Perhaps the data is collected for you and made available at your institution in a data warehouse, and by design the data set has the variables you need for your research. If that is your situation – congratulations – you can skip over this section (to the envy of the other readers of this article).

    However, you may find that either: (a) you have myriad data sources in various file formats that are not available in one system, (b) the data you do have access to may not be complete or you may need to define specific variables and aren’t able to do that with your remote data source, or (c) you wish to have a copy of this data over which you can exercise control. If any of these fit your situation, or you need to gather data by hand, read on, gentle reader!

    One approach is to collect your data in .csv files and import these into a MySQL database table using MySQL’s load data command or a Python script. For example, if you want to research how things went in courses you taught this semester, you might be able to export your grade books from your courses out of your learning management system and into a .csv file. D2L’s Brightspace and Blackboard both provide an option to do this. Some systems may instead support export to an Excel spreadsheet, which can then be saved in .csv format using Excel’s “Save As” function.

    In other cases, your starting point may instead be a .pdf file of your data. Python also has other tools that can be used to import data from .pdf files.
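    For example, the pdfplumber module (one option among several; naming it here is an assumption, not a requirement of this workflow) can pull text or simple tables out of a .pdf page, which you can then write out to a .csv file for the import steps below:

    import csv
    import pdfplumber

    # Hypothetical example: extract the first table found on page 1 of a PDF
    # and write it to a .csv file for later import into MySQL.
    with pdfplumber.open("/path/to/your/data/report.pdf") as pdf:
        table = pdf.pages[0].extract_table()   # a list of rows, each a list of cell strings

    with open("/path/to/your/data/report.csv", "w", newline="") as f:
        csv.writer(f).writerows(table)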

    Or, you may have to create a data collection instrument yourself. You can use Excel for basic data collection by defining specific columns of information you plan to manually collect for your research project. For example:
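    A minimal layout for such a spreadsheet, using the columns assumed throughout the rest of this paper (the rows below are invented placeholders; your own variables will differ), might look like this when saved as a .csv file:

    StudentID,FinalGrade,Intervention,GPA,Race,Gender,Age
    1001,A,1,3.45,B,F,19
    1002,C,0,2.80,W,M,22
    1003,B,1,3.10,B,M,20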

    In creating your data collection instrument, you can help yourself by having column names in the first row, though Python can manage if you don’t do this, as long as you remember what data you put where!

    Data Import

    Once you have your raw data, you again have some options for what to do next. My recommendation is that you import all of your data into a MySQL database, but know that you are free to use a different database solution. If you do, be sure to check that it supports Open Database Connectivity (“ODBC”) and that both Python and R have a way to reach this database to insert and manipulate your raw data, and to query the processed data for analysis later.

    For this step to work, two things have to be true. One, you need a place to put the data, and two, you need a way to import your raw data to that place. To accomplish the former, you can create a MySQL database, add user permissions, and create a table that will store your data using Structured Query Language (“SQL”). Assuming that your data is organized as above in the prior example, you could connect to your MySQL server, create a new database and a new table to store your data as follows:

    mysql -u root -p (after pressing Enter, the terminal will ask for the password to access your database server locally)

    create database mydata; (to make your database)

    use mydata; (to switch into this database)

    CREATE USER 'your_user_name'@'localhost' IDENTIFIED BY 'password (be creative)'; (I know it feels like I am shouting at MySQL, but I learned that from the internet when creating new user accounts)

    GRANT ALL PRIVILEGES ON *.* TO 'your_user_name'@'localhost'; (give your new user account full permissions on your data table; for database administrators cringing in their chairs, you could also be more specific and just grant SELECT permissions if you are really going to be like that with a GRANT SELECT on mydata.* TO 'your_user_name'@'localhost';)

    create table tbMyData (dataID int NOT NULL AUTO_INCREMENT, StudentID int, FinalGrade char(1), Intervention int, GPA decimal(6,2), Race char(1), Gender char(1), Age int, PRIMARY KEY (dataID) );

    Now, that wasn’t so bad, right? Also, all those semicolons are not just me trying to fix a bunch of run-on sentences. MySQL uses the semicolon in its syntax to know when you are at the end of your SQL statement and that it should now execute whatever you have entered.

    Before you go any further, be sure to put the password you created above somewhere safe. You will need that later so that R can reach your database.

    The next step in your process is to import your data into your freshly created database file. You have some choices here. One possibility is to use the MySQL native load data command. This can work if the table you have created above is organized in the same order as the data in your .csv file that you have diligently created earlier from your institutional data sources or you have collected by hand. In order to load data into MySQL, a few things are needed. First, you need to log in to MySQL using the local-infile=1 switch, as:

    mysql --local-infile=1 -u root -p

    Next, you need to tell MySQL which database you wish to use (use mydata;), and then tell MySQL that you want to load local data into the database with this command:

    set GLOBAL local_infile = 'ON';

    These safety precautions are meant to keep you from harming your innocent data with careless data import processes; for the more data cavalier, these are unjust restrictions on your data import liberties. Either way, you have to tell MySQL you are going to load some data or nothing will happen with the next command in MySQL:

    load data LOCAL infile '/path/to/your/data/Import.csv' into table tbMyData FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' IGNORE 1 ROWS;

    The load data command has a variety of options, but the obvious use here is to take a file you have provided the path to and import it into a table called tbMyData. The remainder of the syntax tells load data the field delimiter (in my case that is a comma, but you could use a pipe or semicolon or anything else you want if you control the file export process; “comma separated values” files assume a comma is the field delimiter), the row delimiter (the '\n' here means a newline), and IGNORE 1 ROWS tells load data that your first row contains column headers and is not the data itself. You won’t need the latter if you didn’t define any column headers because you are living on the data-edge and prefer to rely on your memory that column 39 contained the student’s Age, while column 14 was the final grade in the course.

    Load data won’t work in all situations. In fact, I have created just such a situation in this paper because I created a primary key field as the first field in my table above, but the sample .csv file I created starts instead with the StudentID.

    Even if I didn’t manufacture a reason to turn to Python, my experience is that when you are getting data from other sources, these files will contain data that you don’t need to import, or data that needs to be manipulated before you can insert it into your database. Python scripting is a flexible way to address some of these kinds of problems. Before we go on, you might want to grab a cup of coffee or walk the dog, especially for those with command line anxiety syndrome!

    Import with Python to MySQL

    Python is a flexible scripting language that can reliably import data for your project into your MySQL database. There are some basic parts to the script below that I have broken into sections to help make the script easier to follow. When working in this language, be mindful of your indenting as this tells Python whether a particular block of code belongs with a particular condition, function, loop, or other control structure in your code. Also, comments start with the hash tag, #.

    Here are the basic building blocks of code for this project:

    1. Modules Imported into your Python Script
    2. Defined functions that begin with “def”
    3. Database connection variables to reach your MySQL database
    4. Identify the .csv file you wish to process
    5. Loop through each row of the .csv file and process each column of data, then
    6. Insert a record using SQL into your database and commit the record

    Ready? Here we go!

    The Modules you need:
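    A minimal set of imports for a script of this kind might be (the exact list is an assumption and will depend on what your own script does):

    # pandas reads the .csv file, pymysql talks to the MySQL database,
    # and re and datetime support the cleanup helpers shown below.
    import re
    from datetime import datetime

    import pandas as pd
    import pymysql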

    MySQL is weird about the format of dates for insert statements:
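    A small helper along these lines can reformat dates into the YYYY-MM-DD form MySQL expects (the input format "%m/%d/%Y" is an assumption about your raw data; adjust as needed):

    # Hypothetical helper: convert a raw date such as "1/31/2024" into the
    # YYYY-MM-DD format that MySQL expects, returning None for bad values.
    def to_mysql_date(raw_date):
        try:
            return datetime.strptime(str(raw_date).strip(), "%m/%d/%Y").strftime("%Y-%m-%d")
        except ValueError:
            return None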

    I know this is a shock, but sometimes your raw data has weird characters that will cause problems for your code. This function will try to strip out weird stuff so your database doesn’t choke on malformed insert statements further on.
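    One way to write such a function (a sketch; the set of allowed characters is an assumption you should adjust for your own data) is:

    # Hypothetical cleaning function: keep letters, digits, and a few safe
    # punctuation characters so stray quotes or control characters do not
    # break the INSERT statements built later in the script.
    def clean_field(value):
        if value is None:
            return ""
        return re.sub(r"[^A-Za-z0-9 .,@_-]", "", str(value)).strip()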

    The next portion of the code will define your connection with pandas to your MySQL database:
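    A sketch of those settings (the values below are placeholders; substitute your own credentials and file path) might be:

    # Database connection variables and the .csv file to process.
    DB_HOST = "localhost"
    DB_USER = "your_user_name"
    DB_PASS = "your_password"
    DB_NAME = "mydata"
    CSV_FILE = "/path/to/your/data/Import.csv"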

    The remainder of the code will then open a connection to the database, connect through pandas to the .csv file you are importing, fill in default values in fields where data may be missing, and then loop through each row of your file and insert the values into the MySQL database using a constructed SQL statement and executing it with the open MySQL cursor object.
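    A minimal sketch of that remaining portion (the column names follow the tbMyData table created earlier; the default values for missing data are assumptions) might look like:

    # Open the database connection and read the .csv file through pandas.
    conn = pymysql.connect(host=DB_HOST, user=DB_USER, password=DB_PASS, database=DB_NAME)
    df = pd.read_csv(CSV_FILE)

    # Fill in default values for fields where data may be missing.
    df = df.fillna({"FinalGrade": "F", "Intervention": 0, "GPA": 0.0,
                    "Race": "U", "Gender": "U", "Age": 0})

    # Loop through each row, build a parameterized INSERT, and execute it
    # with the open MySQL cursor object.
    cursor = conn.cursor()
    query = ("INSERT INTO tbMyData "
             "(StudentID, FinalGrade, Intervention, GPA, Race, Gender, Age) "
             "VALUES (%s, %s, %s, %s, %s, %s, %s)")

    for _, row in df.iterrows():
        cursor.execute(query, (
            int(row["StudentID"]),
            clean_field(row["FinalGrade"]),
            int(row["Intervention"]),
            float(row["GPA"]),
            clean_field(row["Race"]),
            clean_field(row["Gender"]),
            int(row["Age"]),
        ))

    conn.commit()   # commit the inserted records
    cursor.close()
    conn.close()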

    For data junkies who are opposed to header rows on principle, you can also refer to a particular column by its number, starting with 0 for the first column (obviously – who would start at 1 for the first column?): row[0] (if you read the file with pandas using header=None, the columns are labeled with integers).

    Once you have created your script, save it with a name you can remember, and the extension .py so that everyone will know that this is a python script. Then you can execute it by using a command at the terminal like so:

    python /path/to/your/script/script_name.py

    And, if we are all living right, python will find your .csv file, roll through each and every row in the file, and insert each one into your database file.

    For the rest of us that are living in the sin that is our reality, there may be a problem and python will tell you what row offends its sense of natural order for further troubleshooting. Python tends to get mad at you for having too many or too few indents for blocks of code, and for syntax errors (the period (.) concatenates strings in some languages, whereas plus (+) concatenates in Python). Stay calm and troubleshoot with a cup of coffee and a dose of patience. If you have a particular step with an error, using the print() command to print out information at the command line while the script runs is a method to help troubleshoot errors (such as SQL syntax errors, print(query)).

    Now, the above script has to be modified for each file you want to import, which can be a pain if you have many files to process. You can probably find a way to loop through all of the .csv files you want to process in a particular directory with Python and the os module. Extra credit if you can find a method to move processed files out of your import directory using shutil.
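    For instance, a sketch along those lines (the directory paths are placeholders, and the commented line stands in for the import logic shown above) could be:

    import os
    import shutil

    IMPORT_DIR = "/path/to/your/import/directory"     # assumed location of raw .csv files
    DONE_DIR = os.path.join(IMPORT_DIR, "processed")   # assumed destination for processed files
    os.makedirs(DONE_DIR, exist_ok=True)

    for name in sorted(os.listdir(IMPORT_DIR)):
        if name.lower().endswith(".csv"):
            path = os.path.join(IMPORT_DIR, name)
            # ... run the import logic from the script above on `path` ...
            shutil.move(path, os.path.join(DONE_DIR, name))   # move the processed file out of the way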

    So, did your data import? You can check by going over to your MySQL database and executing a query like:

                select * from tbMyData;

    If things are not as you had hoped, never fear! You can always truncate table tbMyData; and re-run your Python script after making adjustments to get things as you need them to be.

    Using R for Data Analysis

    Well, you have made it this far, kind reader, and I trust that the data journey has treated you well. Your final destination is within your grasp and your dreams of data analysis are on the cusp of being realized! Let’s get started with R!

    At the terminal, anticlimactically type the letter R and press enter, and behold!

    $ R

    If all went well, R awaits your command (albeit, with ABSOLUTELY NO WARRANTY).

    What happens next is entirely determined by your research questions. R is flexible. Numerous people far smarter than I have developed all sorts of statistical packages that you can import into R for analysis.

    For some projects, a linear regression or a multi-variable linear regression model may be the tool that you need. R supports this through the lm and glm functions.

    For other projects, your research may require a Propensity Score Matching approach. R supports this through the popular MatchIt package. Multi-level modeling is also supported through lme4. My guess is that if there is a statistical tool you need, someone has developed one for R that is available for import into your installation.

    To get started with a particular package, be sure to install it in R as above in the Pre-requisites section. To use that package in a particular R session, use the library() command to load that package for your analysis:

    library(RMySQL)

    library(marginaleffects)

    library(cobalt)

    library(MatchIt)

    library(lme4)

    library(rbounds)

    library(parallel)

    library(ggplot2)

    If you are missing a dependency, you can install that package if R doesn’t do it for you automatically with the install.packages command. Also, R is case sensitive, so RMySQL is not the same as rmysql.

    From here, there are some steps you can take to analyze your data as follows:

    1. Connect to your MySQL database using the RMySQL library.
    2. Create a query object that R can use for analysis, passing a query for data to your database, and fetch the results into an object in R.
    3. Plot your data in various ways using ggplot to Quartz and export the charts you like into a .png file.
    4. Execute various analyses of your data using various statistical tools that meet your needs and report the results.

    Here are some examples of commands you can execute in R for your analytical pleasure. Extensive documentation on options and switches is available from any web browser connected to the internet.

    This first set of commands will connect you using the RMySQL package to your MySQL database. The data set that results from your query will reside in the mydata object:

    conn=dbConnect(RMySQL::MySQL(), dbname='mydata', host='localhost', port=3306, user='your_username_here', password='your_password')

    grades=dbSendQuery(conn, "select * from tbMyData")

    mydata=fetch(grades, n=-1)

    You can create a scatterplot of your data if you want to compare two variables to look for relationships between them, such as student GPA and final course grade:

    figure1 <- ggplot(mydata, aes(x=GPA, y=FinalGrade)) + geom_point() + geom_smooth(method = "lm", se = FALSE) + labs(x="GPA", y="Final Grade", title="Final Grades by GPA") (entering figure1 at the R prompt will then display the chart in the Quartz window)

    You can export charts that you like from Quartz to an image file on your computer using the ggsave command:

    ggsave(filename = "/path/to/save/your/image/figure1.png", plot = figure1, width = 8, height = 11, dpi = 600)

    You can create linear regressions of your data using lm. The basic syntax is lm(DependentVariable ~ IndependentVariables, data=mydata), with the independent variables separated by the plus (+) symbol:

    m0=lm(FinalGrade~GPA+Intervention+Race+Gender+Age, data=mydata)

    summary(m0)

    The summary command will then display a table of the results of your linear regression, giving you Residuals, Coefficients, Residual standard error, Multiple and Adjusted R-squared, and the F-statistic. Significant results are denoted with asterisks (*** indicates p < 0.001), which suggests a very low probability of a chance relationship between the dependent and independent variable. Negative estimates suggest a lower FinalGrade for those variables, whereas positive estimates suggest a higher FinalGrade for those associated variables.

    For the above analysis to work, some data manipulation is required to convert strings into numbers. For example, the final letter grade could be converted into a number (A=4, B=3, C=2, D=1, FW=0), Gender coded as Male=1 else 0, and Race coded as 'B'=1 else 0, using a view of your underlying data table in MySQL.

    create view vwMyData as select StudentID, case when FinalGrade='A' then 4 when FinalGrade='B' then 3 when FinalGrade='C' then 2 when FinalGrade='D' then 1 else 0 END as CodedGrade, Intervention as Inter, GPA, case when Race='B' then 1 else 0 END as isAA, case when Gender='M' then 1 else 0 END as isMale, Age from tbMyData;

    Then, you can reload your data into R, using the view as your source rather than the underlying data table:

    grades=dbSendQuery(conn, "select * from vwMyData")

    mydata=fetch(grades, n=-1)

    Depending on the scope of your project, you may instead want to build a Propensity Score Matching (“PSM”) model for analysis of your data. For educational research, PSM may result in a more accurate analysis of instructional treatments because the matching methodology ensures that similar students are included in both your control and treatment groups, resulting in a more accurate estimate of the Average Treatment effect on the Treated (“ATT”) in the study.

    Rosenbaum and Rubin (1983) originally developed the Propensity Score as expressed in the following formula: ei = Pr(Zi = 1 | Xi), where ei is the propensity score of individual i, Xi is a vector of features or characteristics for individual i, and Zi is a binary variable indicating whether or not individual i received the treatment. The purpose of calculating a Propensity Score is to create similar treatment and control groups so that the distribution of known covariates is similar between the two groups (Austin, 2011) (“Thus, in a set of subjects all of whom have the same propensity score, the distribution of observed baseline covariates will be the same between the treated and untreated subjects.”). Several other researchers have discussed the use of PSM and MatchIt in R, including Greifer, 2022; Ho, et al., 2011; Zhao, et al., 2021; and Fischer, et al., 2015.

    R supports a variety of PSM methodologies. This paper will focus on using MatchIt. One of the key issues for PSM is collecting relevant independent variables that may have a non-chance relationship to the dependent variable. Alyahyan & Dustegor’s research identified a substantial number of independent variables that may predict student performance based on the research of others (Alyahyan & Dustegor, 2020). As a practical matter, not all of these variables may be available to conduct matching, or the research sample may be too small to use all variables to match control and treatment units effectively. Some discretion must be exercised by the researcher to develop a meaningful and proper PSM model from the data available.

    The syntax to build a PSM model is:

    m0.out<-matchit(Inter~GPA+isMale+isAA+Age, data=mydata, distance='glm', method='exact')

    summary(m0.out)

    plot(summary(m0.out))

    MatchIt supports a variety of methods for matching control and treatment groups, which are discussed in more detail by Zhao, et al., 2021. In the example above, I have used the “exact” method, which requires that a control and a treatment unit be essentially identical to each other to be included in the matched data. As a practical matter, this method may exclude too many treated units to result in a fair match, which is why the researcher should consider several possible matching methods for their analysis.

    The plot of the summary of this model is a Love plot, which will help visually assess whether the model is appropriately balanced across all the independent variables used for matching. Generally, a balanced model would have an absolute standardized mean difference of less than 0.1 for all variables used to create the model.

    If the model seems appropriately balanced, the next command creates a match object:

    mdata0 <- match.data(m0.out)

    From here, you can create a linear regression fit object, which can be fed into comparisons to calculate the ATT:

    fit0 <- lm(CodedGrade ~ Inter+GPA+isMale+isAA+Age, data=mdata0, weights=weights)

    comp0 <- comparisons(fit0, variables="Inter", vcov=TRUE, newdata=subset(mdata0, Inter==1), wts="weights")

    summary(comp0)

    The summary command on the comparisons object will give you a calculation of the ATT.

    In this case, the ATT would be 0.0323 with a relatively low p value suggesting in our made up example that the treatment was highly correlated with an increase in student final grades when controlling for other variables included in the match object.

    Parting Thoughts

    The database administrator in me feels the need to recommend you backup your data stored in MySQL just in case you have a data disaster and need to recover from a checkpoint. A simple script you can run in Terminal will create a backup file of your MySQL database. You can also copy your python and R scripts to a cloud-based storage file just in case your computer fails you.
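    A sketch of such a backup script (the variable names, credentials, and backup directory below are assumptions; adjust them for your setup) might look like:

    #!/bin/bash
    # Dump each database listed in DB_LIST to a dated .sql file using mysqldump.
    DB_LIST="mydata"                        # add more databases separated by commas
    BACKUP_DIR="$HOME/mysql_backups"        # assumed destination folder
    STAMP=$(date +%Y%m%d)

    mkdir -p "$BACKUP_DIR"
    for DB in $(echo "$DB_LIST" | tr ',' ' '); do
        mysqldump -u your_user_name -p "$DB" > "$BACKUP_DIR/${DB}_${STAMP}.sql"
    done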

    If you have created more than one database, you can add them in the DB_LIST variable separated by a comma, and the do loop will back up each one for you. Extra credit if you can find a way to schedule this script to run on a regular basis so you don’t have to think about backing up your data manually!

    You can restore your databases from these backup files using the following MySQL command:

    mysql -u username -p database_name < backup_file.sql

    In sum, these open source tools can be employed to effectively analyze data for all sorts of interesting research questions. Happy querying!

    FURTHER RESEARCH

    Alyahyan, Eyman, Dustegor, Dilek (2020) Predicting academic success in higher education: literature review and best practices. International Journal of Educational Technology in Higher Education, 17(3). https://doi.org/10.1186/s41239-020-0177-7

    Austin, Peter C. (2011) An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies. Multivariate Behavioral Research 46 (3): 399–424. https://doi.org/10.1080/00273171.2011.568786.

    Fischer, L., Hilton, J., Robinson, T.J. (2015) A multi-institutional study of the impact of open textbook adoption on the learning outcomes of post-secondary students. Journal of Computing in Higher Education, 27, 159-172. https://doi.org/10.1007/s12528-015-9101-x

    Greifer, Noah (2022) MatchIt: Getting Started. https://cran.r-project.org/web/packages/MatchIt/vignettes/MatchIt.html#assessing-the-quality-of-matches

    Ho, Daniel, Imai, Kosuke, King, Gary, Stuart, Elizabeth A. (2011) MatchIt: Nonparametric Preprocessing for Parametric Causal Inference. https://r.iq.harvard.edu/docs/matchit/2.4-20/matchit.pdf

    Rosenbaum, P. R., and Rubin, D.B. (1983). The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika, 70(1), pp. 41-55. https://doi.org/10.1093/biomet/70.1.41

    Zhao, Qin-Yu, Luo, Jing-Chao, Su, Ying, Zhang, Yi-Jie, Tu, Guo-Wei, and Luo, Zhe (2021) Propensity score matching with R: conventional methods and new features. Annals of Translational Medicine, 9(9), 812. https://atm.amegroups.com/article/view/61857 

    AI Generated Art and Copyright Infringement

    Artificial Intelligence is coming to art near you! And with AI’s entrance into the field through systems like Stable Diffusion, DALL-E, and Midjourney, copyright infringement lawsuits are the next wave as we try to sort out the technology and the rights of persons affected by the technology.

    The use of AI in generating art online has become increasingly popular. Many artists and designers are using AI tools to create new and unique pieces of art, from digital paintings to animations and more. The problem is how these AI systems learn how to respond to text prompts from system users.

    One approach to training AI systems is to use images scraped from the internet – regardless of the copyright status of the image – and to use this library of images to inform the algorithm on how to interpret a text prompt from a system user. So, to simplify, if a system user enters a text prompt such as “please make an image of a cow standing in a green field on a rolling hill,” the AI algorithm would need to have reference material for what a cow, a green field, and a rolling hill look like. The reference images that are indexed in relation to these words and phrases help the AI learn how to respond to the text prompt and generate a new image.

    Figure 1 – DALL-E generated image of a “cow standing in a green field on a rolling hill”

    The problem comes in with the images that are used by the AI in its original training, as some aspect of these images must be re-used by the AI to respond to a user text prompt. If the copyright owner of the source image has not dedicated the image to the public domain, or otherwise licensed the work openly, an AI that works from combinations of these protected works may infringe one of the owner’s exclusive rights (such as the rights of reproduction and preparation of derivative works) under section 106 of the Copyright Act.

    And several lawsuits along these lines have been filed, challenging AI companies’ use of these original source images. For example, Andersen v. Stability AI is a putative class action lawsuit brought by artists alleging infringement against Stability and several other AI image generation businesses. Getty Images has also filed a similar type of infringement suit. In Getty’s case, its images carry a proprietary watermark that permits the company to track unlicensed use of its works, and Getty alleges that “millions” of its images have been used by companies like Stability AI without license or consideration.

    This leads to the unanswered question of whether the use of these protected works is infringing and, if it is, whether the use could be a defensible fair use under section 107. A key question for the source works harvested from the internet is whether they are sufficiently original to be protected. The bar for originality set by court decisions is relatively low. Feist v. Rural Tel., 499 U.S. 340 (1991). Absent slavish copying of an original, most photographs will have some originality, at least as to photographer choices about composition and lighting, though the copyright interest in photographs of common objects and scenery must be relatively narrow, which is another way of saying that an infringing work must be nearly identical for liability to accrue. However, there really is no fixed formula for the originality of photographs or for the degree of similarity required for a subsequent work to be infringing. Such issues require judgment on a case-by-case basis. For example, are Warhol prints based on a photograph of the late musician Prince sufficiently different so as not to copy the protectible aspects of the photograph? The Second Circuit judged them not sufficiently different, and also not sufficiently transformative to be protected under fair use. Even a casual reading of the literature on fair use reveals the wide range of decisions and the absence of simple rules in this area. Andy Warhol Foundation for the Visual Arts v. Goldsmith, 992 F.3d 99 (2d Cir. 2021) (oral arguments heard at the US Supreme Court Oct. 12, 2022).

    Another problem of duplication involves the doctrine of scenes a faire, which essentially is the idea that genres or subject matters of works will necessarily share some common ground. Feist, 499 U.S. at 340. All works depicting penguins will share commonality in the “penguinness” of the birds pictured; therefore, my photograph of penguins cannot prevent all others from taking their own photo of that bird at the zoo or in the wild. Teaching AI what image relates to the word “penguin” necessarily means showing the AI many penguin images for it to construct some generalized model of “penguin.” Therefore, copyright holders would not be able to prevent AI from generating images of penguins, even images that are based on images made by others, as all pictures of penguins necessarily include aspects of the bird’s form, markings, and size that make it a penguin rather than another bird or animal.

    Another unresolved issue with this situation is the idea of sampling. The Ninth Circuit discussed several sampling cases – where a small portion of a work is used by another such that the average observer would not notice the appropriation – and explained that such sampling is generally not actionable infringement. Bell v. Wilmott Storage, 12 F.4th 1065 (9th Cir. 2021) (discussing Newton v. Diamond, 388 F.3d 1189 (9th Cir. 2004)). A court could view the AI’s use of the copyrighted materials of others as mere sampling of that protected material, and therefore not actionable, if the AI were to stitch together hundreds of pieces of digital files to render a response to a user-supplied text prompt.

    A final fair use issue has to do with Google’s admitted copying of millions of paper books and indexing of those books into its searchable database, without permission or license from the copyright holders. The Second Circuit ultimately affirmed the finding of fair use in this case, partly on the grounds that works still within copyright, while indexed in the database, only appear to searchers in “snippet” form that displays a limited portion of the relevant book page with the search result. Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015). In addition, many courts will examine whether the use of the plaintiff’s work is “transformative,” though this determination often can be viewed as subjective. For example, the Second Circuit did not find Andy Warhol’s use of Goldsmith’s photograph of Prince transformative, but the same circuit has found numerous other appropriation artists’ uses of another’s work transformative, such as in Cariou v. Prince, but not in Rogers v. Koons, yet confoundingly Koons’ use of a different photograph in Blanch v. Koons was sufficiently transformative. Andy Warhol, supra; Cariou v. Prince, 714 F.3d 694 (2d Cir. 2013); Rogers v. Koons, 960 F.2d 301 (2d Cir. 1992); Blanch v. Koons, 467 F.3d 244 (2d Cir. 2006). Perhaps a court could view AI’s use of photographs scraped from the internet as sufficiently transformative, as AI generates a “new” image based on the user text prompt from relevant reference material.

    While the source materials are photographs that are published on the internet, a court would most likely view them as within the core protection of the Copyright Act, so this factor would likely favor photographers whose works have been used by AI to generate images. Cariou, 714 F.3d at 710.

    The third factor, the amount and substantiality of the protected work used by AI to generate the subsequent image to the AI user, is less certain as to which party is favored. Id. We do not really know how much of the scraped images the AI actually uses in the finished product it displays to users of these various systems, or whether the AI has used more than is necessary to accomplish its transformative purpose.

    Finally, courts would have to examine whether the AI’s use “usurps” the existing market for the original photographs. Id. at 709. The market for AI-generated art is relatively new. DALL-E has a paid subscription level for image generation beyond a minimum number of free images, whereas Midjourney is a subscription-based system that starts at $96 per year. What remains to be determined is whether these paid systems reduce the market value of works in Getty Images’ database. For example, are AI users merely generating images with the AI system that they would otherwise just license from Getty Images? While this may seem unlikely from a cursory use of the AI systems, the evidence of market impact offered in court may support either party’s contentions here. However, if the AI systems are generating income, the door is open for these systems to enter into some licensing arrangement with the owners of the works they have used and go “legitimate.”

    Stay tuned for what’s next and further market disruption by AI systems available and yet to come to market.

    Arbitration Clauses in Maryland

    Arbitration Clauses in Contracts

    An arbitration clause is a provision in a contract that requires disputes to be resolved through arbitration rather than in court. The general purpose of arbitration clauses is to provide a more efficient and cost-effective way for parties to resolve disputes. There is a strong federal and Maryland public policy that favors arbitration where parties have agreed in advance to resolve disputes in that manner, as this reduces the load on the civil court system, and also respects the voluntary, private choices of persons and businesses.

    Benefits for companies can include:

    • Lower costs than litigating in court, as arbitration proceedings generally do not involve appeals of decisions made by the arbitrator.
    • More control over the process and the outcome.
    • Confidentiality of the proceedings and the outcome. Unlike civil court proceedings, where the public has a right to access non-privileged records and where information about court proceedings is available through online resources such as the Maryland Judiciary Case Search, arbitrations are confidential by default.
    • The possibility of a more favorable outcome, as arbitrations preclude jury trials that may be more favorable to consumers, and also preclude class action lawsuits against businesses. Instead, aggrieved consumers must litigate individually in an arbitration proceeding.

    Benefits for consumers can include:

    • Potentially lower costs than litigating in court, though consumers need to take into account the cost of the arbitrator, which generally must be shared with the other party and must be paid in advance to file an arbitration.
    • A faster resolution than in court. Arbitrations can be resolved on average in less than a year, whereas actions in state courts like the Maryland circuit courts may take years (https://go.adr.org/impactsofdelay.html).

    However, there are also costs and drawbacks to consider:

    • Consumers may have less protection under arbitration than in court, especially if they are not familiar with the process or do not have the resources to hire an attorney. Generally, arbitrations are governed by different rules of procedure than civil cases, which in some cases may limit rights to obtain discovery, or limit other litigation steps.
    • Consumers may be limited in their ability to participate in class action lawsuits, which can provide more leverage in disputes, and also permit pooling of small claims for resolution making the class action more cost-effective for consumers in some cases.
    • Companies may use arbitration clauses to limit consumer rights and protections.

    The Maryland federal district court recently decided a case involving a broad, mandatory arbitration clause in an employment agreement between a car dealership and a former manager. Brown v. Brown’s Md. Motors, Inc. (D. Md. June 10, 2022). In that case, the plaintiff had entered into an employment agreement that contained a broad arbitration clause, and was subsequently terminated from his employment. The plaintiff sought to have his day in court against his employer, but his employer filed a motion to compel arbitration. In response, the court analyzed the plaintiff’s defenses to enforcement of the arbitration clause, applying Maryland contract law to the dispute. Such defenses can include: (a) claims that the agreement was not entered into or was otherwise defective, (b) applicable contract defenses like unconscionability, or (c) in this case, claims that arbitration would prevent the plaintiff from vindicating important federal rights. Ultimately, the court dismissed these claims and ordered arbitration.

    If you are considering an arbitration agreement, schedule a consultation on Zoom to review the details before you enter into the agreement.

    Is AI Coming for Your Law Job?

    Artificial Intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think and learn like humans. AI systems can be trained to perform tasks such as recognizing speech, understanding natural language, and making decisions.

    Legal research is a primary target for an AI intervention, as current legal research tools are fragmented and require advanced knowledge by the human user to identify relevant authority. These tools can also be time consuming, as the human user often still must sift through search results to evaluate if the case, statute, or other authority is most relevant to the assigned research task.

    Google made some strides in this area of research when it launched Google Scholar. This research tool incorporated publicly available cases across all US jurisdictions and indexed them. It can be more effective than traditional tools at taking natural language search phrases and finding relevant results, though it lacks other research tools to validate that cases or statutes remain in force.

    Another key issue is that different research databases may return different results, and therefore a careful researcher may have to consult multiple databases, including official paper-based reporters, to get a complete picture on the applicable legal authority.

    AI has been used in the practice of law in several ways, including:

    1. Legal research: AI-powered legal research tools can analyze large amounts of legal text and help lawyers quickly find relevant case law and statutes. For example, LexisNexis has a “brief analysis” product within Lexis+ that utilizes AI to quickly summarize cases for legal researchers. Brief Analysis in Lexis+
    2. Contract review: AI-powered contract review tools can assist lawyers in analyzing and summarizing the key terms and provisions of contracts. A variety of vendors offer solutions in this area, such as Foley & Lardner LLP, LinkSquares, and Klarity.
    3. Litigation Support & eDiscovery: AI-powered predictive coding tools can help lawyers identify relevant documents in large sets of data, such as during discovery in litigation, though debate continues as to whether AI lives up to the marketing hype surrounding litigation support. Law.Com (2022)
    4. Chatbots: AI-powered chatbots can assist lawyers in answering frequently asked legal questions and guiding clients through legal procedures.

    In the court system, AI has been used for tasks such as:

    1. Sentencing: AI-powered tools can assist judges in determining appropriate sentences for defendants based on factors such as criminal history and the nature of the crime, though such use is not without concerns about bias and due process. Hillman, Noel (2019)
    2. Predictive policing: AI-powered predictive policing tools can assist law enforcement in identifying areas where crimes are more likely to occur and allocating resources accordingly, though such tools may also lead to claims of racial and ethnic bias. Verma, Pranshu (2022) Washington Post

    Overall, AI is being used increasingly in the legal system to analyze data and make predictions, but it is important to note that the implementation and use of AI in the legal system is still in its infancy, and there are concerns regarding bias and accountability. AI holds the promise of improving knowledge worker productivity, for example, to aid a human knowledge worker in more quickly identifying relevant authority and summarizing it for the human knowledge worker. Important limitations on AI remain, however. For example:

    1. Complex legal reasoning: The legal system is complex and requires a deep understanding of the law and its application. AI systems may struggle to fully grasp the nuances of legal reasoning, making human lawyers more effective in this area.
    2. Communication and negotiation: The legal system requires human interaction and communication, as well as the ability to negotiate and come to agreements. AI systems may not be able to fully replicate the emotional intelligence and interpersonal skills of human lawyers.
    3. Ethical considerations: The legal system involves many ethical considerations, such as the protection of individual rights, which may be difficult for AI systems to fully comprehend. More generally, concerns exist about the ethical use and implementation of AI (Garibay, Ozlem, et al. (2023))
    4. Creativity: There are cases where creative thinking is required to find solutions, and AI systems may not have the ability to think creatively.
    5. Understanding of social, cultural and economic context: In the legal industry, understanding the broader context of laws, regulations, societies and cultures is important in order to make informed decisions.

    However, the rise of ChatGPT and the substantial reported investment in this platform by companies like Microsoft (Browne, Ryan (2023) CNBC) suggest that 2023 will likely be a year to watch for AI in a job near you!